Foreword


HOW CAN WE COMPREHEND the mysteries of the irrational? Irrational
numbers are almost as old as mathematics. The School of Pythagoras in 
the 5th century BC believed that “all is number” and rational. It is reported 
that the Pythagoreans were at sea when the discovery of incommensurable 
ratios was made and that the discoverer was thrown overboard for violating 
their code. Geometric construction problems, such as squaring the circle and 
trisecting angles, led to more questions about algebraic equations. Going even
further, transcendental numbers, beyond the power of algebraic numbers, were
recognised by Euler in the 18th century.


Irrationality and Transcendence in Number Theory deals with the clas- 
sification of numbers from the time of Pythagoras to the present day. The 
study of irrational and transcendental numbers has been marked by seem- 
ingly impossible problems and striking ideas whose ripples spread far beyond 
the realm of numbers. I recall the excitement about one recent breakthrough. 
In 1966, Alan Baker made a major discovery in transcendental number theory. 
He applied it to obtain a new class of transcendental numbers and went on to 
develop quantitative versions and solve many classical diophantine equations. 
Alan Baker visited Australia and lectured on his work. He was a master of 
precision and elegance. 


Kurt Mahler was one of the first distinguished mathematicians invited to 
start the new research institute at the Australian National University in the 
1960s. I was fortunate to hear Mahler expound Alan Baker’s work at the 
Australian Mathematical Society Summer Research Institute held at the Uni- 
versity of Tasmania in 1970. The lectures were called simply, “A theorem of 
A. Baker”. Mahler was a meticulous lecturer. He started and finished each 
lecture with extraordinary punctuality. As he spoke, he wrote a succinct ex- 
position on the blackboard, with the key points neatly placed in order in his 
characteristic rectangular boxes. His enthusiasm for mathematics inspired a 
generation of number theorists in Australia, led by Alf van der Poorten. I 
joined this group when I took up a lectureship at the University of New South 
Wales in 1972, and at that time, I was introduced properly to transcendence 
and Mahler’s method. 


Irrationality and Transcendence in Number Theory tells the story from its 
origins in the discovery of irrational numbers to the ideas behind the work of 
Baker and Mahler on transcendence. The story focuses on important themes 
involving irrationality, algebraic and transcendental numbers, continued fractions
and Diophantine approximation. These topics make an excellent introduction
to modern number theory for advanced undergraduates and early
postgraduates.


The book has some unusual and valuable features. Transcending the def- 
initions, theorems and proofs, great care is taken to explain the ideas and 
illustrate them with examples and well-chosen exercises. This enriches the 
story, and draws in some material which is not found in more traditional 
didactic approaches. 


One of the themes is Hermite’s method. Hermite’s work in the nineteenth 
century introduced so-called auxiliary functions. Hermite used his method 
to prove that e is transcendental, paving the way for further progress by 
Lindemann, Gelfond, Schneider and, eventually, Baker. Hermite’s method for 
proving irrationality and transcendence is discussed in two chapters, and the 
irrationality and transcendence of e and π and much more are demonstrated.
The treatment explains some of the mystery behind the proofs and shows how 
the ideas developed. 


Continued fractions and their approximation properties are developed with 
a little help from Sherlock Holmes. Again, there are some unusual additions. 
There is a sketch of Apéry’s miraculous proof that ζ(3) is irrational, first
announced in 1978. In addition, there is an account of generalised continued
fractions, illustrated by Lambert’s proof that π is irrational and more.


A final theme revolves around transcendence and computability. This 
theme is a tribute to Mahler’s method in transcendence theory, and the ac- 
count describes the examples that inspired Mahler to develop his method in 
the 1930s. Later, Mahler’s examples were linked to finite automata, generating 
a new field of study on the computability of decimal expansions. Taking these 
ideas further, Adamczewski and Bugeaud [1] proved in 2007 that the decimal 
expansion of an algebraic irrational cannot be generated by a finite automa- 
ton. Despite further remarkable recent progress in 2018 by Adamczewski and 
Faverjon based on Mahler’s method, some simple problems still seem out of 
reach. For example, how can one approach the widely held belief that the ex- 
pansions of a number in base 2 and base 10 should have no common structure? 


I warmly recommend Irrationality and Transcendence in Number Theory 
as a guide and introduction to number theory. It leads the reader through 
developments in number theory from ancient to modern times and contains 
plenty of exercises for practice and problems for exploration. It draws on a 
wide variety of mathematical techniques, briefly summarised in appendices. 
The book adopts the tone of a kindly teacher but one not afraid to challenge. 
It will prepare the reader to appreciate the recent breakthroughs in Baker’s 
method and Mahler’s method and tackle the mysteries of the irrational. 


John Loxton 
13 June 2021 


Preface 


IRRATIONALITY AND TRANSCENDENCE is a study whose roots go back to
about 500 BC, when Pythagoras or one of his followers proved that, contrary 
to “common sense”, certain ratios of geometric quantities cannot be expressed 
as fractions in whole numbers. While the Ancient Greeks succeeded in proving 
various surd expressions to be irrational, little further progress was made until 
the eighteenth century, when Euler and Lambert proved the irrationality of e, 
π and related numbers. We look first at more modern proofs of these results,
deferring Lambert’s work until later. 


The question of whether a number is algebraic or transcendental — that 
is, whether it is or is not a root of a polynomial with integer coefficients — is 
deeper and harder than that of irrationality. After giving a survey of the basic 
ideas regarding algebraic numbers, we shall prove the existence of transcen- 
dentals, firstly (following Cantor) without exhibiting any particular example! 
The simplest approach to showing that a specific number is transcendental is 
to study its approximations by rational numbers; continued fractions provide 
an important tool for doing so. Taking another look at e and π, we shall adapt
Hermite’s method to prove the transcendence of these numbers. 


A (relatively) recent and fascinating topic connects transcendence with 
deterministic finite automata, a kind of very elementary theoretical comput- 
ing device. Ideas concerning such automata can be used to investigate the 
transcendence of numbers that display some sort of “pattern” in their decimal 
expansions or continued fractions. 


The principal aim in writing this book was to present some of the most 
fundamental techniques for proving numbers irrational or transcendental to 
students in their later undergraduate or early postgraduate years. These 
techniques range from the earliest ideas, which may be described as purely 
numerical (actually, in their original version, geometrical), through to the 
calculus-based approaches of the nineteenth and twentieth centuries and the
mid-to-late twentieth century links with automata theory. I hope that the 
book will also communicate to readers the delightful, elegant and often sur- 
prising nature of many aspects of the subject, and that the expositions given 
before the formal proofs of some of the harder results will provide a flavour of 
the process of mathematical discovery, which is invariably more exciting than 
the mere contemplation of a polished answer. 
PREREQUISITES FOR READING THIS BOOK


One of the most fascinating aspects of this subject is that it uses techniques 
from widely diverse areas of mathematics: number theory, calculus, set theory, 
complex analysis, linear algebra, order structures, and the theory of compu- 
tation will all be touched upon. I am firmly convinced that it is not only 
possible, but appropriate, for readers without any specialist background in 
these areas to take them on trust in studying irrationality and transcendence. 
The necessary details have been provided in summary form as appendices to 
each chapter. For readers who have already studied these topics, the appen- 
dices should, in most cases, serve as a reminder, or a checklist, of what prior 
knowledge is assumed. Those who have not should carefully study the facts 
listed and, in particular, should ensure that they understand both the assump- 
tions and the conclusion of all theorems given. In the appendices, proofs are 
provided only in two circumstances: where the specific result needed, though 
implicit in references cited, may be hard to locate explicitly; and where I 
could not bring myself to omit an attractive and illuminating line of argu- 
ment. In some cases, the knowledge set out in the appendices exceeds what 
the reader requires, and is provided as an encouragement for further study by 
those readers who so wish. 


The content of the book was originally developed as a set of lecture notes 
for honours level (fourth-year undergraduate) studies in mathematics at the 
University of New South Wales, Sydney, Australia. It was expected that most 
students would be largely familiar with the necessary prior knowledge (with 
the possible exception of the theory of computation) and would only need the 
appendices as a brief refresher. However, talented and enthusiastic students 
in their third year, and even some exceptional students in their second year, 
occasionally took the course, and these were able without excessive difficulty 
to pick up what they needed from the appendices. For the convenience of the 
reader, we give some further details. 


• Basic number theory: divisibility, primes, modular arithmetic. It
would be wise for readers to ensure that they are familiar with the 
material in the appendix to Chapter 1. The Prime Number Theorem is 
mentioned once in Chapter 3. 


• We frequently use the fundamental calculus operations of differentiation
and integration, notably including integration by parts and estimation
of integrals. It is necessary to understand the relative sizes of $n^k$, $b^n$
and $n!$ for large n. One important proof uses the Mean Value Theorem.


• Elementary set theory is really only employed in its capacity as the
fundamental language of mathematics. In a couple of places, the idea of 
a countable set is used: the reader who understands this as a set that 
can be listed in a finite or infinite sequence, and who knows (or is willing 
to accept) that the set of complex numbers is not countable, need go no 
further. For those who would be interested in a somewhat more formal
approach, additional details are provided in appendices 3.1 and 4.3. 


• Complex analysis is used in proving the transcendence of π, section 5.2;
however, little is required except the integration of analytic
functions, which is effectively identical with the integration of real func- 
tions. A small number of results concerning Taylor series and analytic 
functions are used in Chapter 6. 


• A small amount of linear algebra and group theory is needed in
Chapter 3 to prove certain essential results about algebraic numbers 
and algebraic integers. Exceptionally, an appendix is not provided in 
this case, as the amount of background required is too great to fit within 
a brief account. Many readers, however, will have met vector spaces and 
linear algebra in their studies and should have little difficulty in following 
our arguments. If necessary, the proofs in section 3.1 may be passed over, 
as long as the results, which are entirely comprehensible without linear 
algebra, are read and understood. 


• Some basic properties of order structures are used in proving Lindemann’s
Theorem and are explained in appendix 5.3.


• Chapter 6 relates certain ideas in the theory of computation to transcendence
proofs. This is an area that is very much less likely to form part
of an undergraduate mathematics curriculum than those listed above. A 
couple of formal definitions are given in appendix 6.1 (mainly because I 
find them interesting); however, for the purposes of the main text, the 
informal ideas given at the start of section 6.1 are amply sufficient and 
should be easily absorbed by readers. 


• Finally, a few very short appendices contain certain elementary facts
about inequalities, solutions of simultaneous equations and other such 
matters. 


Besides this, it is expected that readers will have a basic familiarity with the 
aims and ideas, and some of the essential techniques of mathematical proof. 
In particular, proof by contradiction will frequently be important. This is 
too vast, and too independent, a subject to be combined with the present 
work, and readers who need to improve their background in this area are 
cordially recommended to the texts by Polya [52], Franklin and Daoud [26] 
and Solow [60] which are listed in the bibliography, or to the many other works 
which have been published recently in this field. 


In writing this book, I have striven in many places to simulate the tone of 
a class discussion. I trust that my expositions of the thinking behind certain 
difficult arguments, which in some cases almost amount to proving the same 
result twice, will not be attributed to the woolliness of the author’s mind 
or the flabbiness of his prose, but will help readers to attain an intuitive 
understanding of the subject which may not always be provided by a text 
which confines itself to a strict and unvarying alternation of theorem and proof. 
For similar reasons, I have not always kept my eye firmly on the principal
aims of this book but have availed myself of frequent opportunities to go off 
on tangents. Some of these asides provide historical background; some are 
discursions into other areas of mathematics. I make no apology for this; in my 
view, a higher-level undergraduate course should seek not merely to deepen 
but also to broaden students’ mathematical culture and education. 


Each chapter contains exercises to help readers develop their skills further,
including many which, I hope, will be found entertaining and thought-provoking.
I have not given a great number of “drill” exercises, since readers
should be able to provide their own where needed. For instance, the first exer- 
cise in Chapter 4 asks for the continued fraction of just one rational number: 
I feel confident that readers who need more practice in this technique will 
recognise the fact and will be able to select their own examples. 
CHAPTER 1


Introduction 


It can be of no practical use 
to know that π is irrational, but if we can know,
it surely would be intolerable not to know. 


E.C. Titchmarsh 


Die ganzen Zahlen hat der liebe Gott gemacht, 
alles andere ist Menschenwerk. 


Leopold Kronecker 


THOUGH THE ORIGINS of number concepts must largely be a matter of
speculation, it seems clear that people must from a very early period have 
seen the necessity of counting collections of objects. It seems clear also that 
some kind of numerical abstraction of the counting numbers (the concept of 
“five”, rather than “five fingers”, “five sheep” or “five days”) must have been 
understood a long time ago. Probably the next development of numerical 
technique would have been connected with division of a quantity into equal 
parts and would have led to the use of fractions. (Zero and negative numbers 
did not appear until many centuries later.) Early Greek geometers seem to 
have assumed that any two line segments are commensurable: that is, that a 
“common measure” can always be found such that each of two given segments 
is an integral multiple of the common measure. This is, in effect, to assume 
that the ratio of lengths of line segments is always a rational number. In the 
time of Pythagoras (ca. 570-490 BC) it was discovered that this assumption 
is invalid in the case where the segments in question are the side and diagonal 
of a square. This discovery had various consequences: one, the development of 
an improved theory of proportion (Eudoxus, ca. 408-355 BC) which applied 
equally to incommensurable and to commensurable lines; another, the study 
of irrational numbers. 


Assuming that the positive integers 1,2,3,... are known we can construct 
the (signed) integers, the rationals, the reals and the complex numbers. Each 
construction may be based on the previously known numbers, and at every 
stage we should prove that the necessary algebraic laws hold (including some 
that we already know to hold in the earlier cases). Indeed, we can go back
even further than this and define the positive integers in terms of set theory, 
here also taking care to prove the essential properties of the integers. For the 
present topic, however, this development is of no importance and we shall work 
the other way around: we shall assume that we know everything(!) about the 
complex numbers C, and shall then define rational and irrational numbers as 
subsets of C. 


Definition 1.1. A rational number is a number of the form p/q, where p 
and q are integers and q is not zero. An irrational number is any (complex) 
number which is not rational. 


It is well known that any rational number can be written uniquely in the 
form p/q, where p and q are integers with no common factor and q is positive. 
We shall frequently assume when we refer to a rational number p/q that 
these properties already hold. Any non-real complex number is necessarily 
irrational; for this reason we shall initially concern ourselves mainly with real 
numbers. However, it turns out that some problems involving real numbers 
can be significantly simplified by writing them in terms of complex numbers: 
an example will be seen in exercise 1.12 at the end of this chapter. Moreover, 
in Chapter 3 we shall need to consider complex numbers in connection with 
a further subdivision of the irrational numbers. 


1.1 IRRATIONAL SURDS


The following result is well known, and was, essentially, proved by Pythagoras 
or one of his followers. 


Theorem 1.1. $\sqrt{2}$ is irrational.


Proof by contradiction. Suppose that $\sqrt{2} = p/q$, where p and q are integers
with no common factor, and with $q \ne 0$. Squaring both sides and multiplying
by $q^2$, we have $p^2 = 2q^2$. Thus $p^2$ is even and so p is even, say p = 2r.
Substituting for p gives $q^2 = 2r^2$ and so q is even. Thus p and q have a
common factor of 2, and this contradicts our initial assumption. Therefore,
$\sqrt{2}$ is irrational.


Plato records that his teacher Theodorus proved the irrationality of $\sqrt{n}$
for n up to 17. Historians of mathematics have wondered why he stopped
just here; the question is made harder by the fact that we don’t know exactly
how Theodorus’ proof ran. The following proof of the irrationality of $\sqrt{n}$ for
certain values of n suggests a possible reason for stopping just before n = 17.

First, if n = 4k, then the irrationality of $\sqrt{n}$ is equivalent to that of $\sqrt{k}$;
and if n = 4k + 2, then the method used above for n = 2 can be employed
with only minor changes. So we concentrate on odd values of n. If n is odd
and $\sqrt{n} = p/q$, then $nq^2 = p^2$ and p and q must both be odd; substituting
p = 2r + 1 and q = 2s + 1 and rearranging yields

$$4n(s^2 + s) - 4(r^2 + r) + n - 1 = 0.$$

Consider the case n = 4k + 3. Cancelling 2 from the above equation gives

$$2n(s^2 + s) - 2(r^2 + r) + 2k + 1 = 0,$$


which is clearly impossible as the left-hand side is odd. This method does not 
work directly for n = 4k + 1, so we consider as a subsidiary case n = 8k +5. 
Substituting as above and cancelling 4 we obtain 


$$n(s^2 + s) - (r^2 + r) + 2k + 1 = 0;$$


but as $r^2 + r$ and $s^2 + s$ are both even, this is again impossible.


The remaining possibility is that n = 8k +1; but it appears that this case 
has to be split up into still further subcases, and the proof becomes much 
more complicated (try it!), so we shall stop here. Therefore, we have proved 
the following. 


Theorem 1.2. If n is not of the form 4k or 8k + 1, then $\sqrt{n}$ is irrational. If
n = 4k, then $\sqrt{n}$ is irrational if and only if $\sqrt{k}$ is irrational.


How far can we get using this result? We have settled every case except 
for $n = 4^m(8k + 1) = 1, 4, 9, 16, 17, 25, 33, \ldots$. Of these, the first four obviously
have rational square roots, and so the smallest undecided case is 17.
Thus, if the above was Theodorus’ method of proof, it would have been quite 
reasonable for him to stop before reaching n = 17. Hardy and Wright [29], 
section 4.5, comment further that this proof “[depends] essentially on the 
distinction between odd and even, a matter of great importance in Greek 
mathematics”. 
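The bookkeeping behind this remark is easy to check by machine. The following Python sketch is only an illustration (the function name is ours and does not come from the text): it strips factors of 4 from n, as the second part of Theorem 1.2 allows, and then tests whether what remains is of the form 8k + 1.

```python
from math import isqrt


def settled_by_theorem_1_2(n):
    """Return True if Theorem 1.2 shows that sqrt(n) is irrational.

    Hypothetical helper: factors of 4 are removed first (sqrt(4k) is
    irrational exactly when sqrt(k) is), and the parity argument then
    settles every remaining value that is not of the form 8k + 1.
    """
    while n % 4 == 0:
        n //= 4
    return n % 8 != 1


# The cases left open are exactly the numbers of the form 4^m (8k + 1).
undecided = [n for n in range(1, 40) if not settled_by_theorem_1_2(n)]
print(undecided)  # [1, 4, 9, 16, 17, 25, 33, 36]

# Sanity check: no perfect square is ever declared irrational.
assert all(isqrt(n) ** 2 != n
           for n in range(1, 10_000) if settled_by_theorem_1_2(n))
```

The first undecided value that is not a perfect square is 17, in agreement with the discussion above.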


Comment. The working in the above proof can be simplified somewhat by 
using modular arithmetic; but this concept was not available to the ancient 
Greeks, being introduced into number theory by Gauss, some 24 centuries 
after the time of Pythagoras. 


Our first proof of the irrationality of $\sqrt{2}$ can be slightly reorganised in a
way which admits an important generalisation. 


Lemma 1.3. The Rational Roots Lemma. If a rational number p/q, where p 
and q have no common factors, is a root of the polynomial equation 


$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0$$


with integer coefficients $a_0, a_1, \ldots, a_{n-1}, a_n$, then p is a factor of $a_0$ and q is
a factor of $a_n$.
Proof. Substituting z = p/q, multiplying by $q^n$ and rearranging, we have

$$\begin{aligned}
a_n p^n &= -a_{n-1} p^{n-1} q - \cdots - a_1 p q^{n-1} - a_0 q^n \\
        &= -q\,(a_{n-1} p^{n-1} + \cdots + a_1 p q^{n-2} + a_0 q^{n-1}).
\end{aligned}$$

Hence $q \mid a_n p^n$; but q and p have no common factor, and so $q \mid a_n$. Similarly
$p \mid a_0$.

Corollary 1.4. Another proof of the irrationality of $\sqrt{2}$. Suppose, on the
contrary, that $\sqrt{2} = p/q$. Noting that $\sqrt{2}$ is a root of $z^2 - 2 = 0$ and applying
the above lemma, we have q = 1 and so $\sqrt{2}$ is an integer. But since $1 < \sqrt{2} < 2$
this is clearly not true.


Examples. 


• Let $f(z) = 3z^3 + 4z^2 + 5z + 6$. Suppose that p/q is a rational root of f,
with p and q coprime. Then $p \mid 6$ and $q \mid 3$; without loss of generality q
is positive, so the possibilities are

$$\pm 1,\ \pm 2,\ \pm 3,\ \pm 6,\ \pm\tfrac{1}{3},\ \pm\tfrac{2}{3}.$$

It is clear that f has no positive roots, and a bit of calculation also
eliminates the negative numbers from the above list. So f has no rational
roots.


• Let $f(z) = 168z^3 - 133z + 275$. If we use the above approach we will
have 192 potential roots to check! (Exercise: confirm this.) However, if we
apply the method rather than the result of the Rational Roots Lemma,
a little ingenuity will give a rapid solution. Suppose that f has a root
p/q, where p and q are relatively prime integers. Then

$$168\,\frac{p^3}{q^3} - 133\,\frac{p}{q} + 275 = 0,$$

that is,

$$168p^3 - 133pq^2 + 275q^3 = 0.$$

We have

◦ $7 \mid 168$ and $7 \mid 133$ and $7 \nmid 275$, so $7 \mid q$;
◦ $q^2 \mid 168p^3$ and $p^3$, $q^2$ have no common factor, so $q^2 \mid 168$;

therefore $49 \mid 168$, which is not true. Hence f has no rational roots.
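For readers who would like to confirm the counts in these two examples, here is a minimal Python sketch of the brute-force search suggested by the Rational Roots Lemma. It assumes the two polynomials are $3z^3 + 4z^2 + 5z + 6$ and $168z^3 - 133z + 275$; the function names are our own and purely illustrative.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_root_candidates(coeffs):
    """All fractions +-p/q with p | a_0 and q | a_n.

    coeffs lists a_n, ..., a_1, a_0 (leading coefficient first);
    both a_n and a_0 are assumed to be nonzero.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    return {Fraction(sign * p, q)
            for p in divisors(a_0)
            for q in divisors(a_n)
            for sign in (1, -1)}


def evaluate(coeffs, z):
    """Evaluate the polynomial at z by Horner's rule, in exact arithmetic."""
    value = Fraction(0)
    for c in coeffs:
        value = value * z + c
    return value


def rational_roots(coeffs):
    """Candidates that really are roots, in increasing order."""
    return sorted(z for z in rational_root_candidates(coeffs)
                  if evaluate(coeffs, z) == 0)


first = [3, 4, 5, 6]            # 3z^3 + 4z^2 + 5z + 6
second = [168, 0, -133, 275]    # 168z^3 - 133z + 275

print(len(rational_root_candidates(first)), rational_roots(first))    # 12 []
print(len(rational_root_candidates(second)), rational_roots(second))  # 192 []
```

The exhaustive check agrees with the text, but it is the divisibility argument of the second example, not the search, that explains why no root can exist.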


Definition 1.2. A polynomial is said to be monic if its leading coefficient 
(the coefficient in the term of highest degree) is 1. 


Theorem 1.5. Roots of monic polynomials. Let $\alpha$ be a root of a monic polynomial
with integer coefficients. Then $\alpha$ is either integral or irrational.


Proof. Suppose that $\alpha$ satisfies

$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0$$

with $a_n = 1$ and that $\alpha = p/q$. By the lemma we have $q \mid a_n$, so q = 1 and $\alpha$
is an integer.
Definition 1.3. Any root of a monic polynomial with integer coefficients is
called an algebraic integer. 


Notes. 


• Any “ordinary” integer $n \in \mathbb{Z}$ is an algebraic integer, as it is a root of
the monic polynomial equation $z - n = 0$.


• If we want to emphasize the difference between algebraic integers and
“ordinary” integers, we shall refer to the latter as rational integers. 
The term is justified by the following result. 


Theorem 1.6. A complex number is a rational integer if and only if it 
is both rational and an (algebraic) integer. 


Proof. The forward implication is clear; the converse is just a restate- 
ment of the previous theorem. 


• If $\alpha$ is a complex number such that

$$a_n \alpha^n + a_{n-1} \alpha^{n-1} + \cdots + a_1 \alpha + a_0 = 0$$

for some rational integers $a_0, a_1, \ldots, a_{n-1}, a_n$, not all zero (such a number
is called algebraic), then $a_n \alpha$ is an algebraic integer.


• If $\alpha$ and $\beta$ are algebraic integers, then so are $\alpha + \beta$, $\alpha - \beta$ and $\alpha\beta$.


• If $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are algebraic integers and $\beta$ is a root of

$$z^n + \alpha_{n-1} z^{n-1} + \cdots + \alpha_1 z + \alpha_0,$$

then $\beta$ is an algebraic integer. In particular, if n is a positive rational
integer and $\alpha$ is an algebraic integer, then any (possibly complex) value
of $\sqrt[n]{\alpha}$ is an algebraic integer.
These results (which we shall prove later) make it easy to show that various 
simple expressions consisting of radicals are irrational. For example, $\sqrt{29} - \sqrt{23}$
is the difference of two algebraic integers, and therefore is either integral or
irrational; but

$$0 < \sqrt{29} - \sqrt{23} = \frac{6}{\sqrt{29} + \sqrt{23}} < \frac{6}{5 + 4} < 1,$$

so $\sqrt{29} - \sqrt{23}$ cannot be an integer and must be irrational. For a slightly
harder example, consider the polynomial 


$$p(z) = z^3 - \sqrt{2}\,z^2 + \sqrt[3]{4}\,z - \sqrt[4]{6}.$$


The derivative $p'(z) = 3z^2 - 2\sqrt{2}\,z + \sqrt[3]{4}$ is a quadratic with discriminant


$$\Delta = b^2 - 4ac = 8 - 12\sqrt[3]{4},$$
which is negative; so $p'(z)$ is always positive, $p(z)$ is always increasing, and
$p(z)$ has just one real root $\alpha$. It is easy to see that the coefficients of p are
algebraic integers, and so by the last property listed above $\alpha$ is also
an algebraic integer. A little thought shows that


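$$\begin{aligned}
p(1) &= 1 - \sqrt{2} + \sqrt[3]{4} - \sqrt[4]{6} < 1 - \tfrac{4}{3} + \tfrac{5}{3} - \tfrac{4}{3} = 0, \\
p(2) &= 8 - 4\sqrt{2} + 2\sqrt[3]{4} - \sqrt[4]{6} > 8 - 8 + 2 - 2 = 0
\end{aligned}$$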


and so $1 < \alpha < 2$. Since $\alpha$ is an algebraic integer but not a rational integer, it
is irrational.


1.2 IRRATIONAL DECIMALS 


The following well-known result characterises rational numbers in terms of 
their decimals. Note that the eventually periodic decimal expansions include 
the finite expansions, for instance, $0.123 = 0.123000\cdots = 0.122999\cdots$.


Theorem 1.7. Rationality of decimals. A real number $\alpha$ is rational if and
only if it has an eventually periodic decimal expansion.


Proof. Firstly, suppose that $\alpha$ has an eventually periodic expansion. Without
loss of generality we may assume that $0 < \alpha < 1$, say

$$\alpha = 0.a_1 a_2 \cdots a_s\, b_1 b_2 \cdots b_t\, b_1 b_2 \cdots b_t\, b_1 b_2 \cdots.$$

Let a and b be the non-negative integers with digits $a_1 a_2 \cdots a_s$ and $b_1 b_2 \cdots b_t$
respectively; then

$$\alpha = \frac{a}{10^s} + \frac{b}{10^{s+t}} + \frac{b}{10^{s+2t}} + \cdots = \frac{a}{10^s} + \frac{b}{10^{s+t}} \cdot \frac{1}{1 - 10^{-t}},$$


which is rational. Conversely, suppose that $\alpha = p/q$ is rational, and initially
assume that neither 2 nor 5 is a factor of q. Choose $t = \varphi(q)$, where $\varphi$ is
Euler’s function: see definition 1.6 in the appendix to this chapter. By Euler’s
Theorem we have

$$10^t \equiv 1 \pmod{q}$$

and so q is a factor of $10^t - 1$, say $10^t - 1 = qr$. Hence we can write

$$\alpha = \frac{p}{q} = \frac{pr}{10^t - 1} = a + \frac{b}{10^t - 1};$$

here we have used the division algorithm to guarantee that $0 \le b < 10^t - 1$.
We can thus write b as a number of t digits, say $b = b_1 b_2 \cdots b_t$; it is possible
that $b_1$ is zero. Similarly, write $a = a_1 a_2 \cdots a_s$. Then

$$\alpha = a + \frac{b}{10^t} + \frac{b}{10^{2t}} + \cdots = a_1 a_2 \cdots a_s \,.\, b_1 b_2 \cdots b_t\, b_1 b_2 \cdots b_t\, b_1 b_2 \cdots,$$
and we see that $\alpha$ has an eventually periodic decimal expansion. To complete
the proof we must also consider the case when q has 2 or 5 as a factor. Let
$q = 2^m 5^n q'$, where neither 2 nor 5 is a factor of $q'$; then

$$10^{m+n} \alpha = \frac{2^n 5^m p}{q'} = \frac{p'}{q'},$$

say; by the previous argument, the decimal expansion of $10^{m+n}\alpha$ is eventually
periodic. The expansion of $\alpha$ contains exactly the same digits (with the decimal
point shifted m + n places), so it too is eventually periodic.
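The first half of the proof is really a recipe for turning an eventually periodic decimal into a fraction. A minimal Python sketch of that recipe follows; the function name and the convention of passing the two digit blocks as strings are our own assumptions, chosen only for illustration.

```python
from fractions import Fraction


def periodic_decimal_to_fraction(pre, rep):
    """Exact value of 0.pre rep rep rep ... as a fraction.

    pre is the string of s non-repeating digits (possibly empty) and
    rep is the string of t >= 1 repeating digits.
    """
    s, t = len(pre), len(rep)
    a = int(pre) if pre else 0
    b = int(rep)
    # a/10^s + (b/10^(s+t)) * 1/(1 - 10^(-t)) = a/10^s + b/(10^s (10^t - 1))
    return Fraction(a, 10 ** s) + Fraction(b, 10 ** s * (10 ** t - 1))


print(periodic_decimal_to_fraction("", "3"))       # 1/3
print(periodic_decimal_to_fraction("1", "6"))      # 1/6
print(periodic_decimal_to_fraction("", "142857"))  # 1/7
print(periodic_decimal_to_fraction("122", "9"))    # 123/1000
```

The last line illustrates the remark that a finite expansion such as 0.123 can also be written with a recurring 9.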


Alternative proof (sketch). To show that every rational number $\alpha = p/q$ has
an eventually periodic decimal expansion, suppose without loss of generality
that $0 < \alpha < 1$, and consider how to compute the expansion

$$\alpha = 0.a_1 a_2 a_3 \cdots$$

by division. We divide 10p by q; the quotient is $a_1$ and the remainder, say,
$p_1$. Dividing $10p_1$ by q gives quotient $a_2$ and remainder $p_2$, and so on. Since
division by q gives only a finite number of possible remainders, the remainder
$p_k$ must at some stage be the same as a previous remainder $p_j$. From this
point on the whole procedure repeats and we have $a_{k+1} = a_{j+1}$, $p_{k+1} = p_{j+1}$,
$a_{k+2} = a_{j+2}$ and so on. Exercise. Write out this proof in more detail.
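The division procedure in this sketch translates almost word for word into a short program. The following Python version is offered only as an illustration (the function name and return convention are ours): it records each remainder as it appears and stops as soon as one repeats.

```python
def decimal_expansion(p, q):
    """Long division of p by q, assuming 0 <= p < q.

    Returns (pre, rep): the non-repeating digits and the repeating block
    of the decimal expansion 0.pre rep rep rep ... of p/q.
    """
    digits = []
    first_seen = {}      # remainder -> position at which it first occurred
    remainder = p
    while remainder not in first_seen:
        first_seen[remainder] = len(digits)
        # divide 10 * (current remainder) by q: next digit and next remainder
        digit, remainder = divmod(10 * remainder, q)
        digits.append(digit)
    start = first_seen[remainder]    # the period begins where the remainder recurred
    pre = "".join(str(d) for d in digits[:start])
    rep = "".join(str(d) for d in digits[start:])
    return pre, rep


print(decimal_expansion(1, 7))    # ('', '142857')
print(decimal_expansion(1, 6))    # ('1', '6')
print(decimal_expansion(3, 14))   # ('2', '142857')
print(decimal_expansion(1, 8))    # ('125', '0')
```

Since there are at most q possible remainders, the loop runs at most q times, which is exactly the finiteness argument used in the proof.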


Examples. 


• Consider the number

$$\alpha = \sum_{k=0}^{\infty} 10^{-k^2} = 1.10010000100000010000000010000\cdots.$$

If the decimal expansion is eventually periodic with period length t,
then the periodic part must contain t consecutive zeros, and therefore
must be entirely zero, which is obviously false. So the decimal is not
eventually periodic, and $\alpha$ is irrational. Similar examples (of which we
shall later learn a good deal more) are

$$\sum_{k=0}^{\infty} 10^{-2^k} = 0.110100010000000100000000000000010000\cdots$$

and Liouville’s number,

$$\sum_{k=1}^{\infty} 10^{-k!} = 0.1100010000000000000000010000\cdots.$$


• The Champernowne constant is obtained by stringing together all
the positive integers in their natural order: 


$$C = 0.12345678910111213141516\cdots.$$


This number is irrational by the same argument as above. 
• The Thue sequence is defined inductively as follows: $a_0 = 0$, and

$$a_{2k} = a_k, \qquad a_{2k+1} = 1 - a_k = \begin{cases} 1 & \text{if } a_k = 0 \\ 0 & \text{if } a_k = 1 \end{cases}$$

for any $k \ge 0$. (Note that the first equation is consistent when k = 0.)
Taking this sequence as the digits of an infinite decimal, we obtain the
number


$$\tau = 0.11010011001011010010110\cdots.$$


This is not an eventually periodic expansion. Proof. Suppose the contrary.
Then there exist integers $t \ge 1$ and $N \ge 0$ such that $a_{n+t} = a_n$
for all $n \ge N$; also, we may assume that t is the smallest positive integer
for which such an N exists. First, observe from the definition that if m
is even, then $a_{m+1} = 1 - a_m$. Now let n be an integer not less than N,
and such that n + t is even. Then by using the assumed periodicity of
the Thue sequence, the definition and the previous observation, we have

$$a_{2n+t+1} = a_{2n+2t+1} = 1 - a_{n+t} = a_{n+t+1} = a_{2n+2t+2} = a_{2n+t+2};$$

and so 2n + t + 1 cannot be even. Thus t is even, say t = 2s, and for all
$n \ge N$, we have

$$a_{n+s} = a_{2n+2s} = a_{2n+t} = a_{2n} = a_n.$$

But this is impossible as t is the least integer with such a property,
while clearly s < t. The contradiction shows that the sequence is not
eventually periodic.


Corollary 1.8. The number $\tau$ is irrational.


Comment. An alternative characterisation of the Thue sequence: $a_k$ is
the parity of the number of 1s in the binary representation of the integer k.
The sequence is also known as the Prouhet–Thue–Morse sequence; it has connections
with a wide variety of fields including harmonic analysis, dynamical
systems, differential geometry... and, potentially, penalty shoot-outs
in football [49].
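The inductive definition of the Thue sequence, and the two facts about it used above, can be explored with a few lines of Python. This is only a sketch for experimentation; the function name is ours, and checking finitely many terms is of course no substitute for the proofs.

```python
def thue_sequence(length):
    """Terms a_0, ..., a_{length-1} (length >= 1), computed from
    a_0 = 0, a_{2k} = a_k, a_{2k+1} = 1 - a_k."""
    a = [0]
    for m in range(1, length):
        a.append(a[m // 2] if m % 2 == 0 else 1 - a[m // 2])
    return a


a = thue_sequence(1024)

# The decimal digits a_1 a_2 ... a_23 of the number tau.
digits = "".join(str(d) for d in a[1:24])
print(digits)  # 11010011001011010010110

# Alternative characterisation: a_k is the parity of the number of 1s
# in the binary representation of k.
assert all(a[k] == bin(k).count("1") % 2 for k in range(len(a)))

# Observation used in the proof: if m is even, then a_{m+1} = 1 - a_m.
assert all(a[m + 1] == 1 - a[m] for m in range(0, len(a) - 1, 2))
```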


1.3 IRRATIONALITY OF THE EXPONENTIAL CONSTANT


Once we get beyond radical expressions and decimals, irrationality proofs, 
for the most part, become significantly harder. A notable exception is the 
irrationality of the exponential constant e. Apart from the intrinsic interest of 
the result, its proof provides our first glimpse of an idea which will recur again 
and again in irrationality arguments, and which we shall employ extensively 
in Chapters 2 and 5. 
Theorem 1.9. The exponential constant e is irrational.


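Proof. Assume that $e = p/q$ is rational. That is,

\[ \frac{p}{q} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots , \]

and for any positive integer $n$, we have

\[ \frac{n!\,p}{q} = n!\Bigl(1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}\Bigr) + R , \]

where $R$ (which depends on $n$) is given by

\[ R = \frac{n!}{(n+1)!} + \frac{n!}{(n+2)!} + \cdots . \]

We can estimate $R$ in terms of a geometric series:

\[ 0 < R = \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \cdots < \frac{1}{n+1} + \frac{1}{(n+1)^2} + \cdots = \frac{1}{n} \le 1 . \tag{1.1} \]

In particular, choose $n = q$. Then

\[ R = \frac{n!\,p}{q} - \Bigl(n! + \frac{n!}{1!} + \frac{n!}{2!} + \cdots + \frac{n!}{n!}\Bigr) \]

is clearly an integer; but using (1.1), we have $0 < R < 1$. This is impossible,
and so e is irrational.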
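
The two facts the proof relies on can be checked in exact rational arithmetic. This is only an illustrative sketch: the helper names `head` and `tail_partial` are not from the text, and the tail is necessarily truncated, so the check shows that every partial sum of $R$ lies strictly between 0 and $1/n$.

```python
from fractions import Fraction
from math import factorial


def head(n):
    """n! * (1 + 1/1! + 1/2! + ... + 1/n!), computed exactly."""
    return sum(Fraction(factorial(n), factorial(k)) for k in range(n + 1))


def tail_partial(n, terms):
    """First `terms` terms of R = n!/(n+1)! + n!/(n+2)! + ... , computed exactly."""
    total = Fraction(0)
    term = Fraction(1)
    for j in range(1, terms + 1):
        term /= n + j          # term is now 1 / ((n+1)(n+2)...(n+j))
        total += term
    return total


for n in (1, 2, 5, 10):
    # The bracketed sum multiplied by n! is an integer ...
    assert head(n).denominator == 1
    # ... while the (truncated) remainder stays strictly between 0 and 1/n.
    assert 0 < tail_partial(n, 40) < Fraction(1, n)
```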


Observe that this proof relies essentially on an infinite series for e, and 
therefore has to involve concepts of calculus. In some sense this may be sur- 
prising, as number theory is usually thought of as studying discrete systems 
while calculus is the science of the continuous; in another sense there should 
be no surprise, as it is not even possible to define the number e without re- 
course to calculus techniques. Whether it is in fact a surprise or not, we shall 
find that many of our future proofs will be expressed in terms of calculus. 


1.4 OTHER RESULTS, AND SOME OPEN QUESTIONS 


It is known that $\pi$ is irrational: we shall prove this in the next chapter. It is not
hard to see that at least one of the numbers $\pi + e$ and $\pi e$ must be irrational
(in fact, at least one must be transcendental: see Chapter 3); although, most
likely, both are irrational, this has not been proved for either one individually.
As a consequence of a difficult result due to Gelfond and Schneider (Theorem 5.18)
we know that $e^{\pi}$ is irrational; however it is still unknown whether
or not $\pi^{e}$ is irrational. It can also be shown that various numbers such as, for
example, $e^{\sqrt{2}}$ and $2^{\sqrt{2}}$ are irrational. However, the irrationality of $\pi^{\sqrt{2}}$ and $2^{e}$,
and that of the Euler–Mascheroni constant

\[ \gamma = \lim_{n\to\infty} \Bigl( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n \Bigr) = 0.5772\ldots \]

remain undecided. Another problem which has attracted much attention is to
investigate the irrationality of the numbers $\zeta(n)$. Here $n \ge 2$ is an integer and
$\zeta$ is the Riemann zeta function defined by

\[ \zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^{s}} \]

for $s > 1$. By methods of complex integration we can show that if $n$ is even
then $\zeta(n)$ is a rational number times $\pi^{n}$, and this is known to be irrational. On
the other hand, it is much harder to find out anything of interest about $\zeta(n)$
for odd $n$. In 1978 the French mathematician R. Apéry sensationally proved
that $\zeta(3)$ is irrational. His complicated argument had the appearance of being
completely unmotivated, and all of the techniques he had used would have
been available two centuries earlier: for these reasons, few people believed
that the proof could possibly be correct. Nevertheless it was found possible
eventually to confirm all of Apéry’s assertions and thereby establish what has
been called “a proof that Euler missed”. A brief (but not easy!) account of
Apéry’s work is given in [66].


EXERCISES 


1.1 Assuming that $\sqrt{2} = p/q$, simplify $(2q - p)/(p - q)$. Use the result to
give an alternative proof of the irrationality of $\sqrt{2}$.


1.2 Generalising the previous question, suppose that $k^2 < n < (k+1)^2$.
Show that $\sqrt{n}$ is irrational by assuming that $\sqrt{n} = p/q$ and simplifying
the expression $(nq - kp)/(p - kq)$.


1.3 We can show that $\alpha = \sqrt[m]{n}$ is integral or irrational without relying in
any way on properties of primes. Suppose that $\alpha$ is rational, $\alpha = p/q$
with $q$ minimal, and write $\beta_k = q^{m-k-1}\alpha^{m-k}$; then prove by induction
on $k$ that $\beta_k$ is an integer for $k = 0, 1, \ldots, m-1$.


1.4 Let $a$ and $b$ be unequal positive rational numbers. Show that
(a) if $\sqrt{a} - \sqrt{b}$ is rational, then $\sqrt{a}$ and $\sqrt{b}$ are rational;
(b) if $\sqrt[3]{a} - \sqrt[3]{b}$ is rational, then $\sqrt[3]{a}$ and $\sqrt[3]{b}$ are rational.
Challenge problem. Show that if $a, b$ are positive rationals, $a \ne b$ and
$\sqrt[n]{a} - \sqrt[n]{b}$ is rational, then $\sqrt[n]{a}$ and $\sqrt[n]{b}$ are both rational. It may not be
possible to do this with the methods we have introduced so far, though
there is a reasonably simple solution using ideas from Chapter 3.


1.5 Find all positive rational $r$ such that $r^{1/r}$ is rational.


1.6 Prove that $1, \sqrt[3]{2}, \sqrt[3]{4}$ are linearly independent over $\mathbb{Q}$. That is, show
that the equation $a + b\sqrt[3]{2} + c\sqrt[3]{4} = 0$ has no rational solutions $a, b, c$
other than $a = b = c = 0$.


1.7 Let $p/q$ be a rational number between 0 and 1, with $p, q$ having no
common factor. Prove that

\[ \Bigl| \frac{p}{q} - \frac{1}{\sqrt{2}} \Bigr| > \frac{1}{4q^{2}} . \]


1.8 Show that there are no positive integers $a, b$, except for $a = 3$, $b = 2$,
such that

\[ \frac{\sqrt{2} + \sqrt{a}}{\sqrt{3} + \sqrt{b}} \]

is rational.


1.9 Complete the statement and prove it: “if $n$ is a positive integer and $m$
is prime, then $\log_m n$ is irrational unless...”.


1.10 Show that if $\alpha$ is a root of the polynomial $az^{n} + bz + c$, where $a$, $b$ and
$c$ are odd integers and $n \ge 2$, then $\alpha$ is irrational. Generalise.


1.11 Show that $\alpha = \sin 1^{\circ}$ is irrational.


1.12 Let $r$ be a rational number.

(a) Show that $e^{r\pi i}$ is an algebraic integer.

(b) Show that $\cos r\pi$ is either $0$, $\pm\frac12$, $\pm 1$ or irrational.

(c) What are the possible rational values of $\cos^{2} r\pi$?


1.13 Show that the decimal obtained by concatenating the digits of powers
of 2, that is,

\[ 0.1248163264128256\cdots \]

is irrational.


1.14 Repeat the previous question for powers of 13. That is, prove that

\[ 0.113\ 169\ 2197\ 28561\ 371293\ 4826809 \cdots \]

is irrational.


1.15 Prove that the Copeland–Erdős constant

\[ \alpha = 0.23571113171923293137414347\cdots , \]

whose decimal is obtained by concatenating the digits of the primes in
increasing order, is irrational. You may assume Bertrand’s postulate:
for each integer $n \ge 2$ there is a prime $p$ such that $n < p < 2n$. Although
it is (still) traditionally known as a “postulate”, this result was proved by
Chebyshev in 1850 and is definitely true! See, for example, [2], Chapter 2.


1.16 Let $a_1, a_2, a_3, \ldots$ be a strictly increasing sequence of positive integers,
and let $\alpha$ be the real number obtained by concatenating their digits after
a decimal point. Prove that if $\alpha$ is rational then there exist constants
$c > 0$ and $x > 1$ such that $a_k > c x^{k}$ for all $k$.


1.17 If $p_k$ is the $k$th prime, it can be shown that $p_k/(k \log k) \to 1$ as $k \to \infty$.
(This is related to the Prime Number Theorem: see appendix 3.3.)
Use this and the result of the previous problem to give a proof (different
from that in exercise 1.15) that $0.23571113171923293137\cdots$ is
irrational.


1.18 Let

\[ \alpha = 0.12345678912345678902345678901345678901245\cdots , \]

where the $k$th digit is the sum of the digits of $k$, reduced modulo 10.

(a) Show that the decimal never has the same digit more than twice
consecutively (and so our basic argument for the irrationality of a
decimal, as in the examples on page 7, will not work).

(b) Prove that $\alpha$ is irrational.


1.19 Prove that a real number $\alpha$ is rational if and only if $q!\,\alpha$ is an integer
for all sufficiently large integers $q$. Deduce that $\cos 1$ is irrational.


1.20 By considering the equation $ae + ce^{-1} = b$, show that $e$ is not a quadratic
irrational; that is, $e$ is not the root of a quadratic polynomial with integer
coefficients. Deduce that $e^{2}$ is irrational.


1.21 Generalised “decimal” expansions. Let $\{ g_n \}_{n=1}^{\infty}$ be a sequence of
integers greater than 1, and for each $n$ let $a_n$ be an integer in the range
$0 \le a_n \le g_n - 1$. Write

\[ \alpha = \frac{a_1}{g_1} + \frac{a_2}{g_1 g_2} + \frac{a_3}{g_1 g_2 g_3} + \cdots . \]

Suppose that there are infinitely many $n$ such that $a_n \ne 0$, and infinitely
many $n$ such that $a_n \ne g_n - 1$; and that for each prime $p$, infinitely many
$g_n$ are multiples of $p$. Show that $\alpha$ is irrational.

Comment. “Normal” decimal expansions correspond to the case with
$g_1 = g_2 = \cdots = 10$, which certainly does not satisfy the condition
that infinitely many $g_n$ are multiples of any prime. Thus, this exercise
provides a nice complement to Theorem 1.7.


1.22 Show that $\alpha = e^{\sqrt{2}} + e^{-\sqrt{2}}$ is irrational; deduce that $e^{\sqrt{2}}$ is irrational.


1.23 Let $P$ be a polyhedron with dihedral angles (that is, angles between
adjacent faces) $\alpha_1, \ldots, \alpha_s$ and $Q$ a polyhedron with dihedral angles
$\beta_1, \ldots, \beta_t$. It can be shown (see, for example, [2], Chapter 10) that if
$P$ can be dissected into finitely many pieces which can be reassembled
into (a congruent copy of) $Q$, then

\[ p_1\alpha_1 + \cdots + p_s\alpha_s = q_1\beta_1 + \cdots + q_t\beta_t + r\pi \]

for some integer $r$ and some (strictly) positive integers $p_1, \ldots, p_s$,
$q_1, \ldots, q_t$. Use this result to prove that it is impossible to dissect a
regular tetrahedron and reassemble the pieces to form a cube.


1.24 Book X of Euclid’s Elements contains definitions and propositions about
irrationality. It is confusing and frustrating to read, partly because there
is no algebraic notation, but mainly because in Euclid’s terminology (as
usually translated into English), a “rational line” is one whose square
is rational-in-the-modern-sense. Two lines are said to be “commensurable”
if their ratio is rational-in-the-modern-sense, and “commensurable
in square” if the ratio of their squares is rational-in-the-modern-sense.
“Commensurable in square only” means commensurable in square
but not commensurable. Bearing all this in mind, express the following,
taken from [24], in modern notation, and prove it.

Book X, proposition 73. If from a rational straight line there is subtracted
a rational straight line commensurable with the whole in square
only, then the remainder is irrational.


1.25 For readers who are familiar with complex integration and the method
of residues. Use the following steps to show that if $n$ is even then $\zeta(n)$
is a rational multiple of $\pi^{n}$. Also, indicate where the proof breaks down
if $n$ is odd.

Let $f(z) = z^{-n}\cot z$, and let $C$ be the square contour in the complex
plane having vertical sides at $x = \pm(N + \frac12)\pi$ and horizontal sides at
$y = \pm(N + \frac12)\pi$.

(a) Show that $f$ has simple poles (that is, poles of order 1) at the points
$z = k\pi$ with $k = \pm1, \pm2, \ldots$, and find the residue at each of these
poles.

(b) Explain why $f$ has a Laurent series in powers of $z$ given by

\[ f(z) = b_{n+1}z^{-(n+1)} + \cdots + b_1 z^{-1} + a_0 + a_1 z + \cdots . \]

Use the identity $z^{n+1} f(z) \sin z = z\cos z$ to show that every coefficient
$b_j$ is rational. Deduce that

\[ \int_C z^{-n}\cot z\,dz = 2\pi i\Bigl( r_n + \frac{2}{\pi^{n}}\sum_{k=1}^{N}\frac{1}{k^{n}} \Bigr) , \]

where $r_n$ is a rational number.

(c) Use the identity

\[ |\cot z|^{2} = \frac{\cos^{2}x + \sinh^{2}y}{\sin^{2}x + \sinh^{2}y} \]

to show that $|\cot z| \le 1$ on the vertical sides of the square, and
$|\cot z| \le \coth(\pi/2)$ on the horizontal sides. Deduce that

\[ \int_C z^{-n}\cot z\,dz \to 0 \quad\text{as } N \to \infty \]

and hence, finally, that $\zeta(n)$ is a rational multiple of $\pi^{n}$.


APPENDIX: SOME ELEMENTARY NUMBER THEORY 


This section contains some basic number-theoretic definitions and results 
which you ought to know. Proofs in this section are abbreviated or omit- 
ted, and you should be able to supply proofs for yourself. If necessary, this 
material can be found in any work on elementary number theory. The most 
popular of the classic texts are regularly revised, thereby offering a proven 
exposition together with additions which bring the content and presentation 
up to date. From a very crowded field we mention Hardy and Wright [28], [29], 
Niven and Zuckerman [45], [46] and Baker [10]. 


Lemma 1.10. The division algorithm. If $a$ and $b$ are integers with $b > 0$,
then there exist integers $q$ and $r$ such that $a = bq + r$ and $0 \le r < b$.


Using the division algorithm recursively gives the Euclidean algorithm for 
computing the greatest common divisor of two integers, not both zero. 


Lemma 1.11. The Bézout property. If a and b are integers, not both zero, 
and g is the greatest common divisor of a and b, then there exist integers x 
and y such that ax + by = g. 


Given specific a and b, you should know how to use the Euclidean algorithm 
to find g, x and y. 
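
As a reminder of how the computation runs, here is a minimal sketch of the extended Euclidean algorithm. The function name `extended_gcd` is chosen here for illustration; the loop applies the division algorithm repeatedly while maintaining the invariant that each remainder is an integer combination of $a$ and $b$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g.

    Assumes a and b are integers, not both zero.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        # Invariants: a*old_x + b*old_y == old_r and a*x + b*y == r.
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    if old_r < 0:
        # Normalise so that the greatest common divisor is positive.
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


g, x, y = extended_gcd(240, 46)
assert g == 2 and 240 * x + 46 * y == g
```

The coefficients $x$ and $y$ are not unique; the sketch returns whichever pair the sequence of quotients produces.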


Lemma 1.12. If $a$ and $m$ have no common factor and $a \mid mn$, then $a \mid n$.


Definition 1.4. Let $m$ be a positive integer. We say that integers $a$ and $b$ are
congruent modulo $m$, written $a \equiv b \pmod{m}$, if $m \mid a - b$.


To “reduce an integer $a$ modulo $m$” means to find an integer $b$ such that
$a \equiv b \pmod{m}$ and $b$ lies in a “suitable” range, usually $0 \le b < m$. That
this can always be done is a consequence of the division algorithm. Although
congruence notation is just another way of expressing a divisibility relation,
and in that sense “nothing new”, it is very useful because congruence shares
many of the basic properties of equality.
Lemma 1.13. Properties of congruence. Let $m$ be a positive integer.
• Equivalence properties. For any integers $a, b, c$, we have
  ◦ $a \equiv a \pmod{m}$;
  ◦ if $a \equiv b \pmod{m}$, then $b \equiv a \pmod{m}$;
  ◦ if $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.
• Congruence properties. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then
  ◦ $a + c \equiv b + d \pmod{m}$;
  ◦ $a - c \equiv b - d \pmod{m}$;
  ◦ $ac \equiv bd \pmod{m}$.
  If, also, $n$ is a non-negative integer, then
  ◦ $a^{n} \equiv b^{n} \pmod{m}$.
• A cancellation property. Let $a, b, m$ and $s$ be integers. If $sa \equiv sb \pmod{m}$
  and $\gcd(s, m) = 1$, then $a \equiv b \pmod{m}$.


Definition 1.5. A prime number is an integer greater than 1 which has no 
(positive) factors except for itself and 1. 


Lemma 1.14. Characterisations of primes. Let $p > 1$. Then the following are
equivalent:

• $p$ is prime;
• for all integers $m, n > 0$, if $p = mn$ then $m = 1$ or $n = 1$;
• for all integers $m, n$, if $p \mid mn$ then $p \mid m$ or $p \mid n$.


We remark that in more general situations, the matter of primes is approached 
differently. The definition given above is, properly speaking, the definition of 
an irreducible rather than a prime, and the definition of a prime is the third 
property in the lemma. From this point of view, the lemma shows, essentially, 
that primes and irreducibles are the same in the integers. In extended number 
systems known as integral domains an irreducible need not be prime, though 
it is still true that a prime is necessarily irreducible: see, for example, [62], 
section 4.5. 


Exercise. Let 


\[ S = \{\, a + b\sqrt{-5} \mid a, b \in \mathbb{Z} \,\} . \]

Show that in $S$ the integers 2 and 3 are irreducible but not prime.


Theorem 1.15. The Fundamental Theorem of Arithmetic. Every positive 
integer can be expressed in a unique way as a product of primes. 


Comment. We regard 1 as a “product” of no primes. 
Definition 1.6. Euler’s function $\phi$: for any positive integer $m$ we define $\phi(m)$
to be the number of the integers $1, 2, \ldots, m$ which are relatively prime to $m$.


Theorem 1.16. Fermat’s “little” Theorem and Euler’s Theorem.

• Let $p$ be prime. Then for any integer $a$, we have $a^{p} \equiv a \pmod{p}$; and
if, in addition, $a$ is not a multiple of $p$, then $a^{p-1} \equiv 1 \pmod{p}$.

• Let $m$ be a positive integer and $a$ an integer relatively prime to $m$. Then
$a^{\phi(m)} \equiv 1 \pmod{m}$.


Proof. To prove Euler’s generalisation of Fermat’s Theorem, write $s = \phi(m)$
and let $b_1, b_2, \ldots, b_s$ be those integers from $1, 2, \ldots, m$ which are coprime to $m$.
Form the products $ab_1, ab_2, \ldots, ab_s$ and reduce them modulo $m$. The results
are coprime to $m$, are in the range $1, 2, \ldots, m$ and are all distinct; therefore
they are (in some order) the same as $b_1, b_2, \ldots, b_s$. Hence

\[ b_1 b_2 \cdots b_s \equiv (ab_1)(ab_2)\cdots(ab_s) \pmod{m} ; \]

Euler’s Theorem follows from this, and Fermat’s Theorem is an easy consequence.
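
The permutation step in this proof is easy to observe numerically. The following sketch uses the illustrative names `units` and `euler_phi` (not from the text) and one arbitrarily chosen modulus with $m > 1$; it is a spot check, not a proof.

```python
from math import gcd


def units(m):
    """The integers in 1, 2, ..., m that are relatively prime to m."""
    return [b for b in range(1, m + 1) if gcd(b, m) == 1]


def euler_phi(m):
    """Euler's function: the number of integers in 1..m coprime to m."""
    return len(units(m))


m, a = 20, 7                      # a is relatively prime to m
bs = units(m)

# Multiplying by a and reducing modulo m merely permutes b_1, ..., b_s.
permuted = sorted(a * b % m for b in bs)
assert permuted == bs

# Consequently a^phi(m) is congruent to 1 modulo m.
assert pow(a, euler_phi(m), m) == 1
```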


CHAPTER 2


Hermite’s Method


Talk with Hermite... the more abstract 
entities are to him like living creatures. 


Henri Poincaré 


A DEFECT OF CERTAIN TYPES of irrationality proof is that they apply
largely, even perhaps solely, to artificially constructed examples. A case
in point is the Champernowne constant


0.12345678910111213141516⋯


discussed in Chapter 1. In many ways a much more attractive problem is to
investigate the irrationality of what might be called “naturally occurring”
numbers. We have already seen (exercise 1.2) that $\sqrt{n}$ is irrational for
positive integers $n$ other than perfect squares; while no doubt “naturally
occurring”, these numbers form a somewhat limited class.

[Portrait: Johann Heinrich Lambert (1728–1777)]

We know too that $e$ and $e^{2}$ are irrational (Theorem 1.9 and exercise 1.20):
these facts were first proved, essentially, by Euler in 1737. Another obvious
candidate for investigation is $\pi$. In fact, $\pi$ is also irrational, a result originally
due to J.H. Lambert ([37], English translation [54]) in 1761.


Lambert used relationships between infinite series and continued fractions,
in particular, the formula

\[ \tan s = \cfrac{s}{1 - \cfrac{s^{2}}{3 - \cfrac{s^{2}}{5 - \cdots}}} \]

to prove that if $r$ is a non-zero rational number then $\tan r$ is irrational.
Equivalently, if $\tan r$ is rational and non-zero, then $r$ is irrational, and it follows by
taking $r = \frac14\pi$ that $\pi$ is irrational. By similar methods Lambert also showed
that if $r$ is rational and $r \ne 0$, then $e^{r}$ is irrational. In Chapter 7 we shall study
Lambert’s approach to these questions, making use of arguments inspired by
Lambert’s use of continued fractions.


In the nineteenth century, Lambert’s irrationality theorems were recon- 
sidered by Charles Hermite, who succeeded in giving new proofs based upon 
entirely different ideas; Hermite’s methods have turned out to be more valu- 
able than his predecessor’s. In fact, Hermite gave two related but distinct ways 
to attack such problems. One [31] involves what later became known as Padé 
approximants: quotients of polynomials that provide good approximations to 
the exponential function. The other [30] gives less information about the
exponential function as a whole but simplifies the investigation of certain values
of $e^{x}$. It is this second approach that we now discuss.


2.1 IRRATIONALITY OF $e^{r}$


In the actual details of the final proof, Hermite’s method is (at least for the 
earlier results) not too difficult. However, the motivation behind the proof 
can be obscure. Therefore, instead of giving the proofs straight away, we shall 
start by trying to explain the aims and ideas behind a relatively simple case. 
We wish to generalise results of Chapter 1 by showing that if $r$ is rational
then $e^{r}$ is irrational, with the obvious exception that $e^{0} = 1$.

As usual we seek a proof by contradiction: take $r = a/b$ with $a \ne 0$, and
suppose that $e^{r} = p/q$. Following the method of Theorem 1.9, we try to obtain
a contradiction by constructing an integer that lies between 0 and 1. Hermite’s
idea, which originated in his study of approximations to $e^{x}$, was to consider
the definite integral

\[ \int_0^r f(x)e^{x}\,dx , \tag{2.1} \]

and to identify a function $f$ which will give us what we want. Integrating by
parts yields

\[ \int_0^r f(x)e^{x}\,dx = \bigl(f(r)e^{r} - f(0)\bigr) - \int_0^r f'(x)e^{x}\,dx , \]

and since the integral on the right-hand side has very much the same form as
that on the left, we may apply the same procedure repeatedly to obtain

\[ \int_0^r f(x)e^{x}\,dx = \bigl(f(r) - f'(r) + f''(r) - \cdots\bigr)e^{r} - \bigl(f(0) - f'(0) + f''(0) - \cdots\bigr) . \]

Here the right-hand side purports to contain two infinite series and therefore
must be treated with caution, but if we choose $f$ to be a polynomial, then the
sums will actually involve a finite number of terms only, and we shall have no
convergence problems. We write

\[ F(x) = f(x) - f'(x) + f''(x) - f'''(x) + \cdots , \]

so that

\[ \int_0^r f(x)e^{x}\,dx = F(r)e^{r} - F(0) , \]

and the next step is to make some sort of evaluation of the right-hand side.
An idea that will help is to notice that $F(0)$ will be simple if $f$ has a large
number of derivatives that vanish at $x = 0$; that is, $f(x)$ should have many
factors of $x$. Similarly, $f(x)$ should have many factors of $r - x$ in order to keep
$F(r)$ simple. So we set

\[ f(x) = c\,x^{n}(r - x)^{n} , \]


where $c$, a constant, and $n$ are yet to be chosen. Now

\[ f^{(k)}(0) = k! \times \{\,\text{coefficient of } x^{k}\,\} = k!\,c\binom{n}{k-n}(-1)^{k-n}r^{2n-k} \tag{2.2} \]

if $n \le k \le 2n$, and $f^{(k)}(0) = 0$ otherwise. Recall that our aim is to make
the integral (2.1), or something similar, an integer. The expression for $f^{(k)}(0)$
contains a factor $r^{2n-k}$, and this could have a denominator as big as $b^{n}$.
Therefore, we choose $c = b^{n}$; then $f^{(k)}(0)$ is always an integer, and so is $F(0)$.
Either by invoking the symmetry of $f$ or by direct calculation, we note that

\[ f(r - x) = f(x) \;\Rightarrow\; (-1)^{k}f^{(k)}(r - x) = f^{(k)}(x) \;\Rightarrow\; f^{(k)}(r) = (-1)^{k}f^{(k)}(0) . \]

Therefore, $F(r)$ is an integer too, and so is

\[ q\int_0^r f(x)e^{x}\,dx = pF(r) - qF(0) . \tag{2.3} \]

We wish to show that this integer lies between 0 and 1, and thereby obtain
a contradiction. Now the integrand on the left-hand side is the product of
a (positive) exponential factor, and a polynomial which is zero at the points
$x = 0$ and $x = r$, strictly positive in between. Therefore, the integral is strictly
positive. Within the interval of integration, $f(x)$ is a constant times a product
of $2n$ terms, each at most $r$; therefore

\[ 0 < q\int_0^r f(x)e^{x}\,dx < q\,r\,c\,r^{2n}e^{r} = p\,b^{n}r^{2n+1} . \]

This is clearly not small and our attempt has failed! But returning to (2.2),
we notice that $f^{(k)}(0)$ always has a factor $n!$. Therefore, if we redefine $c$ to be
$b^{n}/n!$, then the expression (2.3) is still an integer and our estimate becomes

\[ 0 < q\int_0^r f(x)e^{x}\,dx < q\,r\,c\,r^{2n}e^{r} = \frac{p\,r\,(br^{2})^{n}}{n!} . \]

Finally, remember that we have not yet chosen $n$. Note that $p$, $r$ and $b$ are
fixed numbers, unchanged when $n$ changes; and recall that if $y$ is any real
constant, then $y^{n}/n! \to 0$ as $n \to \infty$. Therefore, by choosing $n$ large enough
we obtain

\[ 0 < q\int_0^r f(x)e^{x}\,dx < 1 , \]

and we have the desired contradiction.

Having seen the ideas behind Hermite’s proof, let’s now carefully write
out the details: we shall give a variant of Hermite’s proof due to Niven [44].
Since $e^{r}$ is rational if and only if $e^{-r}$ is rational, we may assume that $r$ is
positive (which was, in fact, tacitly assumed in the above working; where?).
The point about evaluating the derivatives of $f$ will be important in future
proofs, so we isolate it in a lemma.


Lemma 2.1. Derivatives of polynomials. Let $a$ and $b$ be integers, with $b \ne 0$.
Let $n$ be a non-negative integer; define the polynomial $f$ by

\[ f(x) = (a - bx)^{n} g(x) , \]

where $g$ is a polynomial with integral coefficients.

• If $g$ has degree at most $n$, then for all $k \ge 0$ the derivative $f^{(k)}(a/b)$ is
an integer divisible by $n!$.

• If $b = \pm 1$, the same conclusion holds, irrespective of the degree of $g$.


Proof. The $k$th derivative of $(a - bx)^{n}$, evaluated at $x = a/b$, is zero unless
$k = n$, in which case the derivative is $(-b)^{n}n!$. Differentiating $f$ by Leibniz’
formula

\[ \frac{d^{k}(uv)}{dx^{k}} = \sum_{j=0}^{k}\binom{k}{j}\frac{d^{j}u}{dx^{j}}\,\frac{d^{k-j}v}{dx^{k-j}} , \]

we see first that if $k < n$, then every term vanishes at $x = a/b$. If $k \ge n$, then
the only surviving term is that with $j = n$, and so

\[ f^{(k)}\Bigl(\frac{a}{b}\Bigr) = \binom{k}{n}(-b)^{n}\,n!\;g^{(k-n)}\Bigl(\frac{a}{b}\Bigr) . \]

But since $g^{(k-n)}$ has integral coefficients, $g^{(k-n)}(a/b)$ is a rational number
with denominator at most $b^{\deg g}$. So, if either $\deg g \le n$ or $b = \pm 1$, then
$b^{n}g^{(k-n)}(a/b)$ is an integer, and hence $f^{(k)}(a/b)$ is an integer times $n!$.
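
The conclusion of the lemma can be tested on a concrete polynomial using exact arithmetic. In the sketch below, polynomials are stored as coefficient lists (constant term first); the helper names and the particular choices of $a$, $b$, $n$ and $g$ are arbitrary illustrations, not taken from the text.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_pow(p, n):
    """n-th power of a polynomial, by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_diff(p):
    """Derivative of a polynomial."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_eval(p, x):
    """Evaluate a polynomial exactly at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def derivatives_at(f, x):
    """The values f(x), f'(x), f''(x), ... up to the degree of f."""
    values = []
    for _ in range(len(f)):
        values.append(poly_eval(f, x))
        f = poly_diff(f)
    return values


a, b, n = 3, 2, 4
g = [5, -1, 0, 7]                      # integer coefficients, degree 3 <= n
f = poly_mul(poly_pow([a, -b], n), g)  # f(x) = (a - b x)^n g(x)

values = derivatives_at(f, Fraction(a, b))
# Every derivative of f at a/b is an integer divisible by n!.
assert all(v.denominator == 1 and v % factorial(n) == 0 for v in values)
```

Replacing `g` by a polynomial of degree greater than $n$ (with $b \ne \pm 1$) is a quick way to see why the degree hypothesis is there.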
Theorem 2.2. (Lambert). If $r$ is a non-zero rational number, then $e^{r}$ is irrational.

[Portrait: Charles Hermite (1822–1901)]

Proof (Hermite/Niven). Let $r = a/b$ with $a \ne 0$, and suppose that $e^{r} = p/q$
is rational. As explained above, we may assume without loss of generality
that $r$, $a$ and $b$ are positive. Now let

\[ f(x) = \frac{x^{n}(a - bx)^{n}}{n!} , \]

where $n$ is a positive integer which will be specified later. Integrating by
parts repeatedly, we have

\[
\begin{aligned}
\int_0^r f(x)e^{x}\,dx &= \bigl(f(r)e^{r} - f(0)\bigr) - \int_0^r f'(x)e^{x}\,dx \\
&= \bigl(f(r)e^{r} - f(0)\bigr) - \bigl(f'(r)e^{r} - f'(0)\bigr) + \int_0^r f''(x)e^{x}\,dx \\
&\;\;\vdots \\
&= F(r)e^{r} - F(0) ,
\end{aligned}
\]

where

\[ F(x) = f(x) - f'(x) + f''(x) - \cdots . \]

Since $f$ is a polynomial, the series is finite and there are no convergence
problems. Lemma 2.1 (derivatives of polynomials) shows that $f^{(k)}(r)$ and
$f^{(k)}(0)$ are always integers; thus

\[ q\int_0^r f(x)e^{x}\,dx = pF(r) - qF(0) \tag{2.4} \]

is also an integer. However, on the interval $0 < x < r$ the polynomial $f(x)$ is
always positive and has absolute value at most $a^{n}r^{n}/n!$; hence

\[ 0 < q\int_0^r f(x)e^{x}\,dx < q\,r\,e^{r}\,\frac{(ar)^{n}}{n!} . \]

If $n$ is chosen sufficiently large, the right-hand side is less than 1, which
contradicts the fact that (2.4) is an integer; this proves the theorem.


A different combination of similar ingredients gives us an alternative
proof. Again let $r = a/b$ be a non-zero rational number, and suppose that
$e^{r} = p/q$. Define

\[ I_n = q\int_0^r \frac{x^{n}(a - bx)^{n}}{n!}\,e^{x}\,dx = \frac{q}{n!}\int_0^r (ax - bx^{2})^{n}e^{x}\,dx . \]

Then we have

\[ I_0 = p - q , \qquad I_1 = a(p + q) - 2b(p - q) ; \]

integrating by parts yields

\[ I_{n+2} = q\int_0^r \Bigl( -\frac{(4n + 6)\,b\,(ax - bx^{2})^{n+1}}{(n+1)!} + \frac{a^{2}(ax - bx^{2})^{n}}{n!} \Bigr) e^{x}\,dx , \]

which can be written

\[ I_{n+2} = -(4n + 6)\,b\,I_{n+1} + a^{2}I_n . \]

It is clear from this recurrence relation and the above initial conditions that
$I_n$ is always an integer. However, estimating the integral as in the previous
proof gives the inequality $0 < |I_n| < 1$ for sufficiently large $n$, a contradiction.
We invite the reader to fill in the details of this proof.
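
The initial values and the recurrence can be confirmed symbolically in $p$ and $q$. The sketch below evaluates $pF(r) - qF(0)$ for $f(x) = (ax - bx^{2})^{n}/n!$ with $F = f - f' + f'' - \cdots$, which is what $I_n$ equals under the assumption $e^{r} = p/q$. The numbers chosen for $a, b, p, q$ are arbitrary placeholders (they do not satisfy $e^{a/b} = p/q$, and no rational numbers do); the identities being tested are purely algebraic. All helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_pow(p, n):
    """n-th power of a polynomial, by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_diff(p):
    """Derivative of a polynomial."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_eval(p, x):
    """Evaluate a polynomial exactly at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def alternating_sum_at(f, x):
    """F(x) = f(x) - f'(x) + f''(x) - ... for a polynomial f."""
    total, sign = Fraction(0), 1
    for _ in range(len(f)):
        total += sign * poly_eval(f, x)
        f = poly_diff(f)
        sign = -sign
    return total


def hermite_value(n, a, b, p, q):
    """p*F(r) - q*F(0) for f(x) = (a x - b x^2)^n / n! and r = a/b."""
    f = poly_pow([0, a, -b], n)
    r = Fraction(a, b)
    return (p * alternating_sum_at(f, r) - q * alternating_sum_at(f, 0)) / factorial(n)


a, b, p, q = 3, 2, 7, 5               # placeholder integers only
I = [hermite_value(n, a, b, p, q) for n in range(8)]

assert I[0] == p - q
assert I[1] == a * (p + q) - 2 * b * (p - q)
for n in range(6):
    assert I[n + 2] == -(4 * n + 6) * b * I[n + 1] + a * a * I[n]
assert all(value.denominator == 1 for value in I)
```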


Corollary 2.3. Irrationality of logarithms. If $r$ is rational, positive and not
equal to 1, then $\log r$ is irrational, where log denotes the natural logarithm of
a positive real number.


2.2 IRRATIONALITY OF $\pi$


Similar ideas can be used to prove the irrationality of $\pi$, and more generally,
of $\cos r$ if $r$ is rational and non-zero. Our aim will be to integrate $f(x)\sin x$
for suitable functions $f$; we shall find that the polynomial used in previous
proofs is not always suitable.


Theorem 2.4. $\pi$ is irrational.


Proof. Suppose that $\pi = a/b$; define

\[ f(x) = \frac{(ax - bx^{2})^{n}}{n!} = \frac{x^{n}(a - bx)^{n}}{n!} \]

and once again integrate by parts:

\[
\begin{aligned}
\int_0^{\pi} f(x)\sin x\,dx &= \bigl(f(\pi) + f(0)\bigr) + \int_0^{\pi} f'(x)\cos x\,dx \\
&= \bigl(f(\pi) + f(0)\bigr) - \int_0^{\pi} f''(x)\sin x\,dx .
\end{aligned}
\]

In the second integration, we have used the fact that $\sin\pi = \sin 0 = 0$.
Continuing to integrate in the same way we obtain

\[ \int_0^{\pi} f(x)\sin x\,dx = F(\pi) + F(0) , \]

where

\[ F(x) = f(x) - f''(x) + f^{(4)}(x) - \cdots . \]

Using Lemma 2.1 (derivatives of polynomials), and recalling that by assumption
$\pi = a/b$, we see that both $F(\pi)$ and $F(0)$ are integers. But $f(x)\sin x$ is
always positive for $0 < x < \pi$ and we have

\[ 0 < \int_0^{\pi} f(x)\sin x\,dx < \pi\,\frac{(a\pi)^{n}}{n!} . \]

If $n$ is sufficiently large, the right-hand side is less than 1, and we have a
contradiction in the usual manner. Therefore, $\pi$ is irrational.
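
To get a feeling for how large "sufficiently large" is, one can locate the point from which a bound of the form $C\,y^{n}/n!$ stays below 1. The sketch below is illustrative only: the function name is invented here, and the value $a = 22$ is a purely hypothetical numerator (as if one were trying to refute $\pi = 22/7$). The terms are built up by successive multiplication to avoid overflowing floating-point powers.

```python
import math


def first_n_below_one(y, prefactor=1.0):
    """Smallest n >= y with prefactor * y**n / n! < 1, for y > 0.

    Once n >= y the terms decrease, so the bound stays below 1 from there on.
    """
    term = prefactor
    n = 0
    while True:
        n += 1
        term *= y / n              # term == prefactor * y**n / n!
        if n >= y and term < 1:
            return n


a = 22                             # hypothetical numerator, for illustration
n_star = first_n_below_one(a * math.pi, prefactor=math.pi)
print(n_star)
```

The threshold grows roughly linearly in $a$, which is harmless for the proof since $a$ is fixed before $n$ is chosen.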


Comment. We might reasonably expect to obtain a similar proof by using
the integral

\[ \int_0^{\pi} f(x)\cos x\,dx \]

with the same polynomial $f(x)$. In fact, the attempt fails utterly! (Exercise.
Explain why.) We can, however, prove the irrationality of $\pi$ by considering

\[ \int_{-\pi}^{\pi} \frac{(a^{2} - b^{2}x^{2})^{n}}{n!}\cos x\,dx ; \tag{2.5} \]

though in this case, the integrand is not always positive for $-\pi < x < \pi$, and
the fact that the integral is non-zero (while possibly “obvious” from figure 2.1)
is slightly tricky to prove carefully.


Figure 2.1. The graph of $y = (a^{2} - b^{2}x^{2})^{n}\cos x$.


2.3. IRRATIONAL VALUES OF TRIGONOMETRIC FUNCTIONS 


The proof of Theorem 2.4 can be viewed in a slightly different light. Taking
$r = \pi$, we assumed that $r$ is rational and used the fact that $\cos r$ is rational
to reach a contradiction. However, the proof relied vitally on the fact that
$\sin\pi = 0$, so one could not expect exactly the same proof to work for arbitrary
rational $r$. Nevertheless, it turns out that by modifying the proof somewhat,
we can prove the following result.
Theorem 2.5. Irrationality of cosines. If $r$ is rational and not zero, then $\cos r$
is irrational.


Proof. Let $r = a/b$ be a non-zero rational; assume that $\cos r = p/q$. Without
loss of generality assume that $a$ and $b$, and hence $r$, are positive. Choose

\[ f(x) = x^{n}(a - bx)^{2n}(2a - bx)^{n} ; \]

in this case we find it more convenient not to include $n!$ in the denominator.
Integrating twice by parts yields

\[ \int_0^r f(x)\sin x\,dx = f(0) - f(r)\cos r + f'(r)\sin r - \int_0^r f''(x)\sin x\,dx ; \]

repeating the procedure and writing

\[ F(x) = f(x) - f''(x) + f^{(4)}(x) - f^{(6)}(x) + \cdots \]

gives eventually

\[ \int_0^r f(x)\sin x\,dx = F(0) - F(r)\cos r + F'(r)\sin r . \tag{2.6} \]

Now observe that $f(x)$ is a polynomial in $(a - bx)$, since

\[ f(x) = \frac{(a - bx)^{2n}\bigl(a^{2} - (a - bx)^{2}\bigr)^{n}}{b^{n}} ; \]

if we set $g(x) = x^{2n}(a^{2} - x^{2})^{n}$, then $g$ is an even function and so $g^{(k)}(0) = 0$
whenever $k$ is odd. But we have

\[
\begin{aligned}
f(x) = b^{-n}g(a - bx) \;&\Rightarrow\; f^{(k)}(x) = (-b)^{k}b^{-n}g^{(k)}(a - bx) \\
&\Rightarrow\; f^{(k)}(r) = -b^{k-n}g^{(k)}(0) = 0 \quad\text{for odd } k ,
\end{aligned}
\]

and so $F'(r) = 0$. Therefore, we can rewrite (2.6) as

\[ q\int_0^r f(x)\sin x\,dx = qF(0) - pF(r) . \tag{2.7} \]

Now applying the lemma on derivatives of polynomials shows that $f^{(k)}(r)$ is
a multiple of $(2n)!$, and hence also a multiple of $(n + 1)!$, for all $k$. In the
case of $f^{(k)}(0)$, we need a little more information than is given by the lemma.
Since

\[ f^{(k)}(0) = k! \times \{\,\text{coefficient of } x^{k}\,\} \]

we see that $f^{(k)}(0)$ is zero for $k < n$ and is divisible by $(n + 1)!$ for $k > n$.
Moreover, for $k = n$ we have

\[ f^{(n)}(0) = n!\,2^{n}a^{3n} . \]
If $n + 1$ is an odd prime greater than both $a$ and $q$, we consider

\[ qF(0) = \pm q\bigl(f^{(n)}(0) - f^{(n+2)}(0) + f^{(n+4)}(0) - \cdots\bigr) . \]

Every term on the right-hand side is a multiple of $n!$, and every term except
the first is a multiple of $(n + 1)!$. Hence $qF(0)$ is an integer which is a multiple
of $n!$ but not of $(n + 1)!$. We have seen above that $F(r)$ is a multiple of $(n + 1)!$.
Finally, therefore, the right-hand side of (2.7) is an integer divisible by $n!$,
and not zero because it is not divisible by $(n + 1)!$; so

\[ |qF(0) - pF(r)| \ge n! . \]

The remainder of the proof follows familiar lines. By estimating the integral,
we have

\[ \Bigl| q\int_0^r f(x)\sin x\,dx \Bigr| \le q\,r\,r^{n}a^{2n}(2a)^{n} = q\,r\,(2ra^{3})^{n} , \]

and for large $n$ this is less than $n!$. So if $n + 1$ is a sufficiently large prime (in
particular, greater than $a$ and $q$), then our estimates for the left and right-hand
sides of (2.7) are incompatible. We have obtained a contradiction and
thereby proved the theorem.
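
The divisibility properties of $f(x) = x^{n}(a - bx)^{2n}(2a - bx)^{n}$ used in this proof can be checked exactly for a small case. The values $a = 3$, $b = 2$, $n = 4$ below are arbitrary choices made for illustration (so that $n + 1 = 5$ is an odd prime greater than $a$), and the helper names are not from the text.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_pow(p, n):
    """n-th power of a polynomial, by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_diff(p):
    """Derivative of a polynomial."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_eval(p, x):
    """Evaluate a polynomial exactly at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def derivatives_at(f, x):
    """The values f(x), f'(x), f''(x), ... up to the degree of f."""
    values = []
    for _ in range(len(f)):
        values.append(poly_eval(f, x))
        f = poly_diff(f)
    return values


a, b, n = 3, 2, 4
f = poly_mul(poly_mul(poly_pow([0, 1], n), poly_pow([a, -b], 2 * n)),
             poly_pow([2 * a, -b], n))

at_r = derivatives_at(f, Fraction(a, b))
at_0 = derivatives_at(f, 0)

# Odd-order derivatives vanish at r, so F'(r) = 0.
assert all(at_r[k] == 0 for k in range(1, len(at_r), 2))
# Every derivative at r is an integer multiple of (2n)!.
assert all(v.denominator == 1 and v % factorial(2 * n) == 0 for v in at_r)
# At 0: zero below order n, the stated value at order n, multiples of (n+1)! above.
assert all(at_0[k] == 0 for k in range(n))
assert at_0[n] == factorial(n) * 2**n * a**(3 * n)
assert all(at_0[k] % factorial(n + 1) == 0 for k in range(n + 1, len(at_0)))
# The order-n derivative itself is not a multiple of (n+1)!, since n+1 = 5 does not divide 2a.
assert at_0[n] % factorial(n + 1) != 0
```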


Comments. What are the features that make the above proof work? In particular,
where did the choice of $f$ come from? First, observe that we could
have calculated the integral (2.6) before having specified the polynomial $f$.
We would then have noted that the right-hand side contains a term $F'(r)\sin r$,
and we know absolutely nothing about the factor $\sin r$. It would appear, then,
that the only hope of success is to choose $f$ in such a way that $F'(r)$ is zero.
Since $F'(x)$ involves only odd-order derivatives of $f$, we give $f$ a factor of $r - x$
to an even power, and “balance” the factor $x^{n}$ with a factor $(2r - x)^{n}$, so that
$f$ is symmetrical about $x = r$. This yields the properties we need.


Note also the new method of proving that the expression (2.7) is not zero.
We cannot do this, as before, by arguing that the integrand is positive, since
the factor $\sin x$ might change sign frequently between 0 and $r$. Compare the
graph in figure 2.1.


Corollary 2.6. Irrationality of sines and tangents. If $r$ is a non-zero rational
number, then $\sin r$ and $\tan r$ are irrational.


Proof. Let $r$ be rational, $r \ne 0$. If $\sin r$ is rational, then by considering the
double-angle formula

\[ \cos 2r = 1 - 2\sin^{2} r \]

we see that $2r$ is a non-zero rational with $\cos 2r$ rational. This contradicts the
above result. The proof for $\tan r$ is similar, using the formula

\[ \cos 2r = \frac{\cos^{2} r - \sin^{2} r}{\cos^{2} r + \sin^{2} r} = \frac{1 - \tan^{2} r}{1 + \tan^{2} r} . \]
EXERCISES


2.1 


2.2 
2.3 


2.4 


2.5 


2.6 


Let 


1 
Lp =o (1— 27)" cos ra dz . 
-1 


Find a reduction formula giving $I_{n+2}$ in terms of $I_{n+1}$ and $I_n$, and use
it to show that $\pi^{2}$ is irrational.


Prove that the area under the graph in figure 2.1 is not zero. 


By considering the integral 


Tw 
I =| f(a) sine dx 
0 
for a suitable polynomial f(x), show that if c is a positive integer then 
m/c is irrational. Hence give a one-line proof of the irrationality of 7. 


Let $r$ be rational and not zero. Show that $\cosh r$ is irrational, and deduce
that $\sinh r$, $\tanh r$ and $e^{r}$ are irrational.


By considering

\[ I = \int_0^1 f(x)\sin rx\,dx \quad\text{where}\quad f(x) = x^{n}(1 - x)^{2n}(2 - x)^{n} , \]

prove that if $r^{2}$ is a non-zero rational number, then $\cos r$ is irrational.


The Bessel function of the first kind of order $\nu$, defined by

\[ J_\nu(x) = \sum_{k=0}^{\infty} \frac{(-1)^{k}}{k!\,\Gamma(\nu + k + 1)}\Bigl(\frac{x}{2}\Bigr)^{2k+\nu} , \]

is encountered in the solutions of many important differential equations.
It has the property

\[ xJ_{\nu+1}(x) = 2\nu J_\nu(x) - xJ_{\nu-1}(x) , \]

which is easy to prove, and

\[ J_\nu(x) = \frac{1}{\sqrt{\pi}\,\Gamma(\nu + \frac12)}\Bigl(\frac{x}{2}\Bigr)^{\nu}\int_0^{\pi}\cos(x\cos\theta)\sin^{2\nu}\theta\,d\theta \quad\text{for } \nu > -\tfrac12 , \]

which is not. The gamma function, which appears in the above formulae,
is defined by

\[ \Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\,dt \]

for real $x > 0$. It is increasing for $x \ge 2$ and satisfies $\Gamma(n + 1) = n!$ when
$n$ is a non-negative integer.
(a) Let $r^2 = a/b$ and $\nu = c/d$, and suppose that $rJ_{\nu+1}(r)/J_\nu(r) = p/q$.
Prove that
$$r^n J_{\nu+n}(r) = r u_n J_{\nu+1}(r) + v_n J_\nu(r)$$
for n ≥ 0, where the sequences $u_n$ and $v_n$ are defined by
$$u_{n+1} = 2(\nu+n)u_n - r^2 u_{n-1}, \qquad v_{n+1} = 2(\nu+n)v_n - r^2 v_{n-1}$$
with suitable initial conditions.


(b) Show that if n is an integer, n ≥ 0, then $q b^n d^n r^n J_{\nu+n}(r)/J_\nu(r)$ is
an integer.


(c) Let t be an integer greater than $-\nu + \frac12$. For sufficiently large
integers n, obtain contradictory estimates for
$$q\,b^{n+t} d^{n+t} r^{n+t}\,\frac{J_{\nu+n+t}(r)}{J_\nu(r)}$$
and hence...
(d) ...conclude that if ν is rational, $r^2$ is rational and not zero, and
$J_\nu(r) \ne 0$, then
$$\frac{rJ_{\nu+1}(r)}{J_\nu(r)}$$
is irrational.


(e) The gamma function also has the property that Γ(x + 1) = xΓ(x)
for all x > 0, including non-integer x. Use this to deduce that
under suitable conditions r tan r is irrational.


2.7 For readers who know some elementary complex analysis. We follow a
method of Desbrow [22] to prove that if $r^2$ is a non-zero rational number,
then r tanh r is irrational. Write
$$f(z) = \cosh\bigl(r\sqrt{z}\,\bigr)$$
and assume that $r^2 = a/b$ and r tanh r = p/q are both rational.
(a) Explain why f has a Taylor series
$$f(z) = \sum_{n=0}^{\infty} c_n (z-1)^n$$
valid for all z ∈ C. Show that the coefficients of the series, except
for $c_0$, can be found by integrating around a circle C in the complex
plane with centre 1 and radius $n^2$ to give
$$c_n = \frac{1}{2\pi i} \int_C \frac{f(z)}{(z-1)^{n+1}}\,dz.$$
(b) Show that w = f(z) is a solution of the differential equation
$$4zw'' + 2w' - r^2 w = 0,$$
and hence find a second-order recurrence for the coefficients $c_n$.
Evaluate $c_0$ and $c_1$.


(c) Prove that $d_n = n!\,2^n b^n q\,c_n/c_0$ is an integer for all n and is
non-zero for infinitely many n.


(d) Show that $|f(z)| \le e^{|r|\,|z|^{1/2}}$ for all complex z.


(e) Use these results to obtain a contradiction and so prove the result
claimed.


APPENDIX: SOME RESULTS OF ELEMENTARY CALCULUS 


Proofs of the following results are again left up to the reader. If assistance is 
required, any basic calculus text should suffice. 


Lemma 2.7. For any real constants β and γ, we have
$$\frac{\beta\gamma^n}{n!} \to 0 \quad\text{as } n \to \infty.$$


Corollary 2.8. Comparison of exponentials and factorials. If β and γ are
real constants, then $|\beta\gamma^n| < n!$ for all sufficiently large integers n.


Lemma 2.9. Derivatives of even and odd functions. Let f be a differentiable 
function from R to R. Then 


• f is even if and only if f' is odd;
• if f is odd then f' is even.


Exercise. Why is the converse of the second result not true? Fill in the gap 
and then prove the statement: 


“if f’ is even and... then f is odd”. 


Corollary 2.10. If g is an odd function, then $g^{(k)}(0) = 0$ for all even k. If g
is an even function, then $g^{(k)}(0) = 0$ for all odd k.


Lemma 2.11. Derivatives of a product (Leibniz' rule). If k ≥ 0, then
$$\frac{d^k(uv)}{dx^k} = \sum_{j=0}^{k} \binom{k}{j}\,\frac{d^j u}{dx^j}\,\frac{d^{k-j} v}{dx^{k-j}}.$$


Lemma 2.12. Estimation of integrals. If |φ(x)| ≤ M for all x ∈ [a,b], and
if the integral exists, then
$$\left|\int_a^b \phi(x)\,dx\right| \le M\,|b-a|.$$
In particular,
$$\left|\int_a^b \phi(x)\,dx\right| \le |b-a| \max_{a \le x \le b} |\phi(x)|,$$
provided the maximum exists.


Taylor & Francis 
Taylor & Francis Group 


http://taylorandfrancis.com 


CHAPTER 3 


Algebraic and 
Transcendental 
Numbers 


The meaning doesn’t matter 
if it’s only idle chatter 
of a transcendental kind. 


W.S. Gilbert 


A RATIONAL NUMBER is one which is the root of a linear polynomial $qz - p$
with integer coefficients. The next level of complexity in the arithmetic
properties of real and complex numbers is reached by considering roots of 
polynomials of higher degree, still having integer (or, equivalently, rational) 
coefficients. The reader may care to ponder whether or not it is obvious that 
a complex number need not be a root of any such polynomial. 


3.1. DEFINITIONS AND BASIC PROPERTIES 


We begin with a definition which was given in a slightly different, but equiv- 
alent, form in Chapter 1. 


Definition 3.1. If the complex number α is a root of a polynomial
$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 \tag{3.1}$$
with rational coefficients and $a_n \ne 0$, then α is said to be algebraic. Any
(complex) number which is not algebraic is called a transcendental number.
More generally, if α is a root of a polynomial such as (3.1) with coefficients in
a field F, then α is said to be algebraic over F; if there is no such polynomial
then α is transcendental over F.
Lemma 3.1. Properties of algebraic numbers.


• A complex number α is algebraic (over Q) if and only if it is a root of a
non-zero polynomial with integral coefficients.


• If α is algebraic, then there exists a unique monic polynomial $f_\alpha$ having
rational coefficients and smallest possible degree, such that $f_\alpha(\alpha) = 0$.
If g is any polynomial with rational coefficients such that g(α) = 0, then
g is a multiple of $f_\alpha$.


• The polynomial $f_\alpha$ is irreducible over Q. That is, $f_\alpha$ cannot be factorised
as the product of two polynomials with rational coefficients and degree
smaller than that of $f_\alpha$.


Proof. To prove the first assertion, just multiply a polynomial with rational
coefficients by a common denominator for its coefficients. In the second
statement the existence of $f_\alpha$ is clear; if g(α) = 0 then dividing g by $f_\alpha$ gives
$$g(z) = f_\alpha(z)q(z) + r(z)$$
with r(α) = 0. But r has smaller degree than $f_\alpha$, so r is the zero polynomial
and hence g is a multiple of $f_\alpha$. The uniqueness of $f_\alpha$ follows since if there
are two polynomials with the minimal-degree property then each is a factor
of the other. To prove irreducibility note that if there is a proper factorisation
$f_\alpha = gh$ then either g or h has α as a root, thus contradicting the minimality
of $f_\alpha$.


Definition 3.2. The polynomial $f_\alpha$ of the above lemma is called the minimal
polynomial of α. The degree of an algebraic number is the degree of its
minimal polynomial. If the minimal polynomial of α has integral coefficients,
then α is called an algebraic integer. Algebraic numbers having the same
minimal polynomial are said to be conjugate to each other.


Lemma 3.2. Gauss’ Lemma. Let f be a polynomial with integral coefficients 
and suppose that f = gh, where g and h are polynomials with rational coeffi- 
cients. Then there is a rational constant c such that cg and h/c have integral 
coefficients, and so f is a product of polynomials with integral coefficients and 
the same degrees as g and h. 


Proof. First, observe that if the polynomial g has integer coefficients with no
common factor, and if h also has integer coefficients with no common factor,
then the same is true of the product gh. For let p be a prime and suppose
that p divides $g_0, g_1, \ldots, g_{k-1}$ but not $g_k$; and that p divides $h_0, h_1, \ldots, h_{\ell-1}$
but not $h_\ell$. Then p is not a factor of
$$g_0 h_{k+\ell} + g_1 h_{k+\ell-1} + \cdots + g_{k-1} h_{\ell+1} + g_k h_\ell + g_{k+1} h_{\ell-1} + \cdots + g_{k+\ell} h_0,$$
which is one of the coefficients of gh (every term except $g_k h_\ell$ is divisible
by p). Thus there is no prime which divides
every coefficient of gh. To prove Gauss' Lemma, let f = gh where g and h
have rational coefficients. Consider the coefficients of g: by extracting the least
common multiple of the denominators and then the greatest common divisor
of the numerators, we can write $g = (s/t)\tilde g$, where $\tilde g$ has integral coefficients
with no common factor; similarly, $h = (u/v)\tilde h$. Thus
$$tv f = su\,\tilde g\tilde h.$$
Since the coefficients of $\tilde g\tilde h$ have no common factor, su is a multiple of tv
and so $(s/t)(u/v)\tilde g$ has integral coefficients. The lemma follows upon choosing
c = u/v.


Corollary 3.3. If α is a root of any monic polynomial with integral coefficients,
then α is an algebraic integer.


Comment. It follows from this corollary that the definition we have just 
given of an algebraic integer is equivalent to the (slightly different) definition 
we gave in Chapter 1. 


Examples. 


• The polynomial $z^2 + 3$ is clearly irreducible over Q. It has two roots
$\pm i\sqrt{3}$, each of which is, therefore, an algebraic integer of degree 2.


• By using multiple angle formulae, or otherwise, it is possible to show that
$\cos\frac{\pi}{7}$, $\cos\frac{3\pi}{7}$ and $\cos\frac{5\pi}{7}$ are the roots of the cubic $8z^3 - 4z^2 - 4z + 1$.
This polynomial is irreducible, so the roots are algebraic of degree 3, and
are conjugate to each other. Note that in this case, conjugate algebraic
numbers are not complex conjugates.


• The fifth root of unity $\zeta = e^{2\pi i/5}$ is a root of $z^5 - 1$. This polynomial is
not irreducible since
$$z^5 - 1 = (z - 1)(z^4 + z^3 + z^2 + z + 1),$$
but the quartic factor then is irreducible. (Proof. The quartic has no
rational roots and therefore no linear factor; if, therefore, it is reducible
then it is the product of two quadratics. By the above lemma we may assume
the factors have integral coefficients; multiplying out and equating
coefficients we soon obtain a contradiction.) Therefore, ζ is an algebraic
integer of degree 4. Its conjugates are $\zeta^2$, $\zeta^3$ and $\zeta^4$.


• The numbers e and π are both transcendental over Q. We'll give proofs
in Chapter 5.
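
The second and third examples can be spot-checked numerically. The following is only a sanity check in floating-point arithmetic, not a proof; it takes the three cosines to be cos(π/7), cos(3π/7) and cos(5π/7), and the function names are illustrative.

```python
import cmath
import math


def cubic(z):
    """Evaluate 8z^3 - 4z^2 - 4z + 1."""
    return 8 * z**3 - 4 * z**2 - 4 * z + 1


def quartic(z):
    """Evaluate z^4 + z^3 + z^2 + z + 1."""
    return z**4 + z**3 + z**2 + z + 1


# The three real conjugates cos(pi/7), cos(3pi/7), cos(5pi/7).
cos_roots = [math.cos(k * math.pi / 7) for k in (1, 3, 5)]
cubic_residuals = [abs(cubic(c)) for c in cos_roots]

# zeta = e^(2 pi i / 5) and its conjugates zeta^2, zeta^3, zeta^4.
zeta = cmath.exp(2j * math.pi / 5)
quartic_residuals = [abs(quartic(zeta**k)) for k in (1, 2, 3, 4)]

print(cubic_residuals)    # all close to zero, up to rounding error
print(quartic_residuals)  # all close to zero, up to rounding error
```

Each residual is zero up to rounding, which is consistent with (but of course does not establish) the claims about roots and conjugates.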


3.1.1. Proving polynomials irreducible 


The next two results are often useful for proving irreducibility of polynomials. 


Lemma 3.4. Eisenstein's Lemma. Let f be a polynomial with integral coefficients,
$$f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0;$$
suppose that there is a prime p such that p is a factor of $a_0, a_1, \ldots, a_{n-1}$ but
not of $a_n$, and such that $p^2$ is not a factor of $a_0$. Then f is irreducible over
the field Q of rational numbers.


Proof. By Gauss' Lemma, we need only show that f cannot be written as
the product of two polynomials with integer coefficients and degree less than
n. Suppose that there are two such polynomials; without loss of generality we
may assume that they have the same number of terms, say
$$f(z) = g(z)h(z) = (b_m z^m + \cdots + b_1 z + b_0)(c_m z^m + \cdots + c_1 z + c_0)$$
with m < n. Looking at the constant terms, $b_0 c_0 = a_0$ is divisible by p but
not by $p^2$; by symmetry, we may assume that $p \mid b_0$ and $p \nmid c_0$. Multiplying
out all the other coefficients shows that p is a factor of $b_1, b_2, \ldots, b_m$ too. But
then p is a factor of every $a_k$, including $a_n$, and this is contrary to our initial
assumption. So f is irreducible.


Examples. 


• The polynomial $z^3 - 12z^2 + 345z - 6789$ is irreducible since 3 is a prime
factor of 12, of 345 and of 6789 but not of the leading coefficient, and
since $3^2$ is not a factor of 6789.


• A slightly more subtle application of Eisenstein's Lemma simplifies the
proof of irreducibility for
$$f(z) = \frac{z^5 - 1}{z - 1} = z^4 + z^3 + z^2 + z + 1.$$
It is not hard to see that we can factorise f(z) if and only if we can
factorise
$$f(z+1) = \frac{(z+1)^5 - 1}{z} = z^4 + 5z^3 + 10z^2 + 10z + 5;$$
but Eisenstein's Lemma with p = 5 shows immediately that f(z + 1) is
irreducible, and hence so is f(z).


• In fact, if p is any prime then $f(z) = z^{p-1} + z^{p-2} + \cdots + z + 1$ can be
proved irreducible by the same method. Exercise. Give the details!
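
The hypotheses of Eisenstein's Lemma are mechanical to test. Here is a minimal sketch, assuming polynomials are stored as integer coefficient lists with the highest degree first; the names `eisenstein_applies` and `shift_by_one` are illustrative, and the caller is responsible for supplying a prime p.

```python
def eisenstein_applies(coeffs, p):
    """coeffs = [a_n, ..., a_1, a_0]. True if the prime p divides a_0, ..., a_(n-1),
    does not divide a_n, and p^2 does not divide a_0."""
    leading, lower, constant = coeffs[0], coeffs[1:], coeffs[-1]
    return (leading % p != 0
            and all(a % p == 0 for a in lower)
            and constant % (p * p) != 0)


def shift_by_one(coeffs):
    """Return the coefficients of f(z + 1), highest degree first (Horner's scheme)."""
    result = []
    for a in coeffs:
        shifted = result + [0]            # multiply the running polynomial by z
        for i, r in enumerate(result):
            shifted[i + 1] += r           # ... and add the running polynomial times 1
        shifted[-1] += a                  # finally add the next coefficient
        result = shifted
    return result


first = [1, -12, 345, -6789]              # z^3 - 12z^2 + 345z - 6789
cyclotomic = [1, 1, 1, 1, 1]              # z^4 + z^3 + z^2 + z + 1
shifted = shift_by_one(cyclotomic)        # z^4 + 5z^3 + 10z^2 + 10z + 5

print(eisenstein_applies(first, 3))       # True
print(eisenstein_applies(cyclotomic, 5))  # False: the lemma does not apply directly
print(eisenstein_applies(shifted, 5))     # True
```

A return value of False says only that the lemma is not applicable with that prime, not that the polynomial is reducible, which is exactly why the substitution z → z + 1 is needed in the second example.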


Notation. By $Z_m$ we denote the ring { 0, 1, 2, ..., m−1 } of integers modulo a
positive integer m. Whenever we work in this ring it is to be understood that
addition, subtraction and multiplication are performed modulo m. Any reader
requiring a brief review of modular (congruence) arithmetic should consult the
appendix to Chapter 1.


Lemma 3.5. Factorisation of polynomials modulo m. Let f be a polynomial
with integer coefficients and suppose that m is not a factor of its leading
coefficient. Write $f_m$ for the polynomial obtained by reducing the coefficients
of f modulo m. If f has a factor of degree n over Z, then $f_m$ has a factor of
degree n over $Z_m$.


Sketch of proof. Suppose that f = gh, where g has degree n. Since equality of
integers implies congruence to any modulus, we have $f_m = g_m h_m$. Moreover,
$g_m$ cannot have degree greater than n, and will have degree less than n only
if m is a factor of the leading coefficient of g; but then m would be a factor
of the leading coefficient of f, contrary to assumption. Thus $g_m$ has degree n.


Corollary 3.6. Testing reducibility with modular arithmetic. If f, a polynomial
with integral coefficients, is reducible over Z, and if m is not a factor of
its leading coefficient, then $f_m$ is reducible over $Z_m$.


We shall commonly apply this corollary in contrapositive form: if there is
any m, not a factor of the leading coefficient of f, for which $f_m$ is irreducible,
then f is irreducible. This can be a good way to prove a polynomial irreducible,
as there is only a limited number of irreducible polynomials with coefficients
in $Z_m$; however, finding a suitable modulus m may require a little ingenuity,
or a lot of trial and error!

Examples. 


• Let $f(z) = z^3 - 4z^2 + 9z + 16$. Choose m = 3; we have
$$f_3(z) = z^3 + 2z^2 + 1.$$
If $f_3$ is reducible it must have a linear factor. However, we can easily
calculate in $Z_3$ that
$$f_3(0) = 1, \quad f_3(1) = 1 \quad\text{and}\quad f_3(2) = 2;$$
so $f_3$ has no roots in $Z_3$, hence no linear factors, and therefore is
irreducible. By the above corollary, f is also irreducible.

• Let $f(z) = 3z^5 + 11z^4 - 12z^3 + 6z - 21$ and try m = 2. We have
$$f_2(z) = z^5 + z^4 + 1$$
and $f_2(0) = f_2(1) = 1$. So $f_2$ has no linear factors, and nor does f.
Being of degree 5, however, $f_2$ could be the product of a quadratic and
a cubic; in fact, it is not too hard to discover that
$$f_2(z) = (z^2 + z + 1)(z^3 + z + 1).$$
Thus $f_2$ is reducible; but note that the converse of the above result is
not true, and so it does not apply to this example.


• Try the previous example with m = 5. We know from our previous attempt
that f has no linear factors, and so the only potential factorisation
that we need to investigate modulo 5 is
$$f_5(z) = 3z^5 + z^4 + 3z^3 + z + 4 = (az^2 + bz + c)(dz^3 + ez^2 + fz + g).$$
We may assume (why?) that a = 1 and d = 3; multiplying out and
equating coefficients gives the simultaneous equations
$$1 = 3b + e, \quad 3 = f + be + 3c, \quad 0 = g + bf + ce, \quad 1 = bg + cf, \quad 4 = cg$$
in $Z_5$. As each coefficient takes one of only five possible values it is not
excruciatingly difficult to try them all.


  ◦ If b = 1 the first three equations give e = 3, f = 2c and g = 0. But
  then the last equation says 4 = 0, which is obviously impossible.

  ◦ If b = 2 we obtain e = 0, f = 3 + 2c and g = 4 + c. Now the final
  equation gives
  $$4 = cg = 4c + c^2 \;\Rightarrow\; c^2 + 4c + 4 = 3 \;\Rightarrow\; (c+2)^2 = 3;$$
  but this is impossible as the squares modulo 5 are 0, 1 and 4 only.


Exercise. Rule out the remaining cases and hence prove that f is
irreducible.


• The polynomial $f(z) = 2z^2 + 3z + 1$ is reducible over Z; but $f_2(z) = z + 1$
is irreducible. This shows that we cannot neglect the requirement that
m be not a factor of the leading coefficient of f in the above result.
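
For a prime modulus the search for factors in these examples can be automated by brute force. The sketch below is one possible approach, not the method of the text: it assumes p is prime (so that any factor can be scaled to be monic), stores coefficients highest degree first, and tries every monic polynomial of degree at most half that of $f_p$. The function names are illustrative.

```python
from itertools import product


def poly_mod(coeffs, p):
    """Reduce coefficients (highest degree first) mod p and strip leading zeros."""
    reduced = [c % p for c in coeffs]
    while len(reduced) > 1 and reduced[0] == 0:
        reduced.pop(0)
    return reduced


def remainder(f, g, p):
    """Remainder on dividing f by the monic polynomial g over Z_p."""
    r = list(f)
    while len(r) >= len(g):
        lead = r[0]
        for i, c in enumerate(g):
            r[i] = (r[i] - lead * c) % p
        r.pop(0)                      # the leading coefficient is now zero
    return r


def monic_factor(f, p):
    """Return a monic factor of f over Z_p of degree 1..deg(f)//2, or None.
    Assumes p is prime."""
    f = poly_mod(f, p)
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for tail in product(range(p), repeat=d):
            g = [1] + list(tail)
            if not any(remainder(f, g, p)):
                return g
    return None


cubic = [1, -4, 9, 16]                # z^3 - 4z^2 + 9z + 16
quintic = [3, 11, -12, 0, 6, -21]     # 3z^5 + 11z^4 - 12z^3 + 6z - 21

print(monic_factor(cubic, 3))         # None: irreducible mod 3
print(monic_factor(quintic, 2))       # [1, 1, 1], i.e. z^2 + z + 1
print(monic_factor(quintic, 5))       # None: irreducible mod 5
```

A result of None for a prime p not dividing the leading coefficient shows, by Corollary 3.6, that the original polynomial is irreducible over Z; a factor found mod p proves nothing about the original polynomial.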


Comments. 


• The integers modulo a prime form a field, and therefore have a “nicer”
arithmetic than the integers to a composite modulus. For this reason it
is customary (though not obligatory) to take m prime in the above type
of problem. However, sometimes a composite modulus will work when
prime moduli don't; see exercise 3.13.


• In a composite modulus the Factor Theorem for polynomials must be
used with caution. The theorem as usually stated is still true:


“α is a root of f if and only if z − α is a factor of f(z)”;


but if f has no roots it still may have a linear factor az + b with,
necessarily, a ≠ 1. For a specific example, consider the polynomial
$f(z) = 3z^2 + 3z + 2$. It is easy to see that for any integer α, we have
f(α) ≡ 2 (mod 6), and so $f_6$ has no roots in $Z_6$; but we cannot conclude
from this that $f_6$ is irreducible, and in fact $f_6(z) = (3z + 1)(3z + 2)$.
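
The composite-modulus example is small enough to verify directly. A minimal sketch, with coefficients listed from the highest degree down and illustrative function names:

```python
def poly_mul_mod(g, h, m):
    """Multiply two polynomials (coefficients highest degree first) modulo m."""
    result = [0] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            result[i + j] = (result[i + j] + a * b) % m
    return result


def roots_mod(f, m):
    """All x in {0, ..., m-1} with f(x) congruent to 0 modulo m."""
    def value(x):
        acc = 0
        for c in f:
            acc = (acc * x + c) % m   # Horner evaluation mod m
        return acc
    return [x for x in range(m) if value(x) == 0]


f6 = [3, 3, 2]                              # 3z^2 + 3z + 2 over Z_6
print(roots_mod(f6, 6))                     # []: no roots modulo 6
print(poly_mul_mod([3, 1], [3, 2], 6))      # [3, 3, 2]: yet f6 = (3z + 1)(3z + 2)
```

So the absence of roots modulo 6 coexists with a genuine factorisation into linear factors, neither of which is monic.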


Lemma 3.7. Denominator of an algebraic number. If α is an algebraic number,
then there is a rational integer d ≠ 0 such that dα is an algebraic integer.


Proof. Suppose that
$$a_n\alpha^n + a_{n-1}\alpha^{n-1} + a_{n-2}\alpha^{n-2} + \cdots + a_1\alpha + a_0 = 0,$$
where each $a_k$ is an integer and $a_n \ne 0$. Choose $d = a_n$: multiplying both
sides of the above equation by $a_n^{n-1}$ and rearranging, we have
$$(a_n\alpha)^n + a_{n-1}(a_n\alpha)^{n-1} + a_{n-2}a_n(a_n\alpha)^{n-2} + \cdots + a_1 a_n^{n-2}(a_n\alpha) + a_0 a_n^{n-1} = 0.$$
Thus $a_n\alpha$ is a root of a monic polynomial with integral coefficients, and by
Corollary 3.3 is an algebraic integer.


Definition 3.3. For any algebraic number α, the smallest positive integer d
such that dα is an algebraic integer is called the denominator of α and is
denoted den α.


Examples.


• If α = p/q is a rational number, with p and q having no common factor
and q > 0, then den α = q. That is, den α is just the denominator, in
the usual sense, of a fraction.


• Let $\alpha = \cos\frac{\pi}{7}$. Then $8\alpha^3 - 4\alpha^2 - 4\alpha + 1 = 0$, as on page 33, and the
proof of the lemma shows that 8α is an algebraic integer. However, 8 is
not the smallest integer with this property, since
$$(2\alpha)^3 - (2\alpha)^2 - 2(2\alpha) + 1 = 0.$$
It is not hard to see that α itself is not an algebraic integer; therefore
d = 1 is not possible and we have den α = 2.
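
The construction in the proof of Lemma 3.7 is easy to carry out on coefficient lists. The following sketch assumes integer coefficients stored highest degree first; the function name is hypothetical. It produces the monic polynomial satisfied by $a_n\alpha$ and checks the cosine example numerically, taking α = cos(π/7).

```python
import math


def monic_for_scaled_root(coeffs):
    """Given [a_n, ..., a_0] with a_n != 0, return the monic integer polynomial
    (highest degree first) satisfied by a_n * alpha, as in the proof of the lemma."""
    a_n = coeffs[0]
    n = len(coeffs) - 1
    # coeffs[k] is a_(n-k); it becomes the coefficient a_(n-k) * a_n^(k-1) of y^(n-k).
    return [1] + [coeffs[k] * a_n ** (k - 1) for k in range(1, n + 1)]


def evaluate(coeffs, z):
    """Horner evaluation of a polynomial given highest degree first."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


cubic = [8, -4, -4, 1]                      # 8z^3 - 4z^2 - 4z + 1
monic = monic_for_scaled_root(cubic)        # y^3 - 4y^2 - 32y + 64
alpha = math.cos(math.pi / 7)

print(monic)
print(abs(evaluate(monic, 8 * alpha)))           # close to zero
print(abs(evaluate([1, -1, -2, 1], 2 * alpha)))  # close to zero: 2*alpha already works
```

The last line reflects the point of the example: the lemma's choice $d = a_n$ always works but need not be the smallest possible denominator.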


3.1.2 Closure properties of algebraic numbers 


Next we shall sketch proofs that the set of (complex) algebraic numbers forms 
a subfield of C, and that the algebraic integers form an integral domain. These 
proofs require a certain acquaintance with basic properties of vector spaces and 
abelian groups; however, the level required is probably too much to summarise 
in an appendix. Therefore, on this occasion only, we invite the interested reader 
to refer to other sources for background material. Two of many possibilities 
are Axler [8] for linear algebra, Stewart and Tall [62] for groups. The reader 
who prefers to continue with the main topics of this book can safely proceed 
to section 3.2 after noting carefully the results of Theorem 3.10, Corollary 3.11 
and Theorem 3.12. 


Lemma 3.8. Let $S = \{\alpha_k \mid k \in K\}$ be a set of complex numbers. Then


• the set of linear combinations
$$\sum r_k\alpha_k$$
with finitely many terms and rational coefficients $r_k$ is a vector space
over the field Q;

• the set of linear combinations
$$\sum m_k\alpha_k$$
with finitely many terms and integer coefficients $m_k$ is an abelian group
under addition.


Lemma 3.9. Finiteness criteria for algebraic numbers. Let α ∈ C; in the
previous lemma take $S = \{1, \alpha, \alpha^2, \ldots\}$. Then


• α is algebraic if and only if the vector space of rational linear combinations
of S is finite-dimensional;


• α is an algebraic integer if and only if the group of integer linear combinations
of S is finitely generated.


Sketch of proof. If α is algebraic of degree n, then all powers of α can be
written as linear combinations of $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$, so the vector space has
a spanning set (in fact, a basis) with n elements, and so is finite-dimensional.
Conversely, if the vector space has dimension n, then $\{1, \alpha, \alpha^2, \ldots, \alpha^n\}$ is a
linearly dependent set, and this yields a polynomial identity satisfied by α.


If α is an algebraic integer of degree n, then every power of α can be
written as an integral linear combination of $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$, and so this
set generates the group. Conversely, suppose that the group is generated by
n elements $p_1, p_2, \ldots, p_n$. Each of these is an integer linear combination of
powers of α; therefore so are $\alpha p_1, \alpha p_2, \ldots, \alpha p_n$, and we can write equations
$$\alpha p_k = m_{k1}p_1 + m_{k2}p_2 + \cdots + m_{kn}p_n \quad\text{for } k = 1, 2, \ldots, n.$$
Transferring all the terms $m_{kj}p_j$ to the left-hand side gives a homogeneous
system of linear equations with a non-zero solution; therefore the determinant
$$\begin{vmatrix} \alpha - m_{11} & -m_{12} & \cdots & -m_{1n} \\ -m_{21} & \alpha - m_{22} & \cdots & -m_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -m_{n1} & -m_{n2} & \cdots & \alpha - m_{nn} \end{vmatrix}$$
is zero. Expanding the determinant gives a monic polynomial in α with integral
coefficients.


Theorem 3.10. Sums, differences and products of algebraic numbers. If α
and β are algebraic, then so are α ± β and αβ. If α and β are algebraic
integers, then so are α ± β and αβ.


Sketch of proof. This is an application of the previous lemma. If α is
algebraic of degree m, then every power of α can be expressed in terms of the
powers $1, \alpha, \alpha^2, \ldots, \alpha^{m-1}$; a similar observation holds for an algebraic number
β of degree n. But every power of α + β can be written as
$$(\alpha+\beta)^k = \sum_{j=0}^{k} \binom{k}{j} \alpha^j \beta^{k-j};$$
expressing each $\alpha^j$ and $\beta^{k-j}$ in terms of the lowest possible powers, expanding
and collecting terms gives a result of the form
$$(\alpha+\beta)^k = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} r_{ij}\,\alpha^i \beta^j.$$
Therefore, the vector space generated by the powers of α + β is also spanned
by the finite set $\{\alpha^i\beta^j \mid 0 \le i < m,\ 0 \le j < n\}$, and so α + β is algebraic.
The proofs of all the other assertions are similar.


Comment. The above proof also shows that the degrees of α ± β and of αβ
are at most the degree of α times the degree of β.
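
The spanning-set argument can be carried out explicitly in a small case. The sketch below is an illustration chosen here, not an example from the text: it takes α = √2 and β = √3, represents every element of the span of $\{\alpha^i\beta^j\}$ by its exact rational coordinates on the basis (1, √2, √3, √6), and finds the linear dependence among the powers of γ = α + β. All names are illustrative.

```python
from fractions import Fraction


def multiply(x, y):
    """Product of two elements given by coordinates on (1, sqrt2, sqrt3, sqrt6)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0 * b0 + 2 * a1 * b1 + 3 * a2 * b2 + 6 * a3 * b3,   # coefficient of 1
        a0 * b1 + a1 * b0 + 3 * (a2 * b3 + a3 * b2),         # coefficient of sqrt2
        a0 * b2 + a2 * b0 + 2 * (a1 * b3 + a3 * b1),         # coefficient of sqrt3
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,               # coefficient of sqrt6
    )


def solve(matrix, rhs):
    """Solve a square, non-singular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


gamma = (0, 1, 1, 0)                       # sqrt2 + sqrt3
powers = [(1, 0, 0, 0)]                    # gamma^0
for _ in range(4):
    powers.append(multiply(powers[-1], gamma))

# matrix[i][j] is the i-th coordinate of gamma^j, for j = 0, 1, 2, 3.
matrix = [[powers[j][i] for j in range(4)] for i in range(4)]
coefficients = solve(matrix, powers[4])    # gamma^4 = sum of coefficients[j] * gamma^j

print(coefficients)   # [Fraction(-1, 1), Fraction(0, 1), Fraction(10, 1), Fraction(0, 1)]
```

The output says that γ⁴ = 10γ² − 1, so √2 + √3 is a root of $z^4 - 10z^2 + 1$, of degree at most 2 × 2 = 4 as the comment predicts. Since the polynomial found is monic with integer coefficients, the example also illustrates the statement about algebraic integers.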


Corollary 3.11. Quotients of algebraic numbers. If α and β are algebraic
and β ≠ 0, then α/β is algebraic.


Proof. All we need show is that a non-zero algebraic number has an algebraic
reciprocal. But if β is a root of
$$b_n z^n + b_{n-1} z^{n-1} + \cdots + b_1 z + b_0,$$
then $\beta^{-1}$ is a root of
$$b_0 z^n + b_1 z^{n-1} + \cdots + b_{n-1} z + b_n;$$
this proves the result.
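
In terms of coefficient lists, the proof simply reverses the list. A minimal sketch with an example chosen here for illustration (the polynomial $2z^2 - 7z + 3$ is not from the text), using exact rational arithmetic:

```python
from fractions import Fraction


def evaluate(coeffs, z):
    """Horner evaluation; coefficients are listed from the highest degree down."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def reciprocal_polynomial(coeffs):
    """Reverse the coefficient list: b_n, ..., b_0 becomes b_0, ..., b_n."""
    return coeffs[::-1]


poly = [2, -7, 3]                          # 2z^2 - 7z + 3 = (2z - 1)(z - 3)
roots = [Fraction(1, 2), Fraction(3)]
reversed_poly = reciprocal_polynomial(poly)   # 3z^2 - 7z + 2

print([evaluate(poly, b) for b in roots])               # both zero
print([evaluate(reversed_poly, 1 / b) for b in roots])  # both zero
```

The reciprocals 2 and 1/3 of the original roots are exactly the roots of the reversed polynomial.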


A result that we mentioned in Chapter 1 is proved by methods similar to 
those of the last theorem. 


Theorem 3.12. Polynomials with algebraic coefficients. If the complex number
β is a root of the polynomial
$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 \tag{3.2}$$
with algebraic coefficients and $a_n \ne 0$, then β is algebraic. If β is a root of the
monic polynomial
$$z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
whose coefficients are algebraic integers, then β is an algebraic integer.


Sketch of proof. Since $a_n \ne 0$ and we know that quotients of algebraic
numbers are algebraic, we may assume that $a_n = 1$ in (3.2). Then every
power of β can be written as a rational linear combination of expressions
$$a_0^{m_0} a_1^{m_1} \cdots a_{n-1}^{m_{n-1}} \beta^m \tag{3.3}$$
with 0 ≤ m < n. But by assumption, each $a_k$ is algebraic of degree (say)
$d_k$, and so every power of $a_k$ can be written as a linear combination of
$1, a_k, a_k^2, \ldots, a_k^{d_k-1}$.
Substituting these linear combinations into the linear combination (3.3) and
expanding shows that every power of β is a linear combination of expressions
like (3.3) in which the exponents satisfy
$$0 \le m_k < d_k \ \text{ for all } k \quad\text{and}\quad 0 \le m < n.$$
There are only finitely many such expressions (in fact, at most $d_0 d_1 \cdots d_{n-1} n$),
so the vector space spanned by the powers of β is finite-dimensional, and β
is algebraic.


To prove the assertion involving algebraic integers, just observe that all
the linear combinations and all the expansions we have considered above will
have integral coefficients.


3.2 EXISTENCE OF TRANSCENDENTAL NUMBERS 


The first question we need to address about transcendental numbers is whether 
or not there are any! It is clear that algebraic numbers exist: for a start, all 
rational numbers are algebraic, and we have also given a few examples of 
irrational algebraic numbers. However, it is conceivable that every complex 
number could be a root of a rational polynomial, in which case transcendental 
numbers would not exist. 


Notice, by the way, that we have so far only seen algebraic numbers of 
degree up to 4. It is not at all clear that algebraic numbers of arbitrarily high 
degree exist. If, for example, we were to consider polynomials with real (rather 
than rational) coefficients, then there would be no irreducible polynomials of 
degree greater than 2. The situation in this case would therefore be very 
simple: all real numbers would be algebraic (over R) of degree 1, and all non— 
real complex numbers would be algebraic (over R) of degree 2. Among the 
complex numbers there would be no algebraic numbers of higher degree, and 
no transcendental numbers. 


The existence of transcendental numbers was first proved by Joseph Liou- 
ville, who attempted to show that e is not an algebraic number. He failed in 
this aim but achieved enough to allow him in 1844 (and again, using different 
techniques, in 1851) to give specific examples of transcendental numbers. A 
completely different proof was given three decades later by Georg Cantor: a 
proof which is perhaps simpler, though, as it does not provide any specific ex- 
amples of transcendentals, possibly somehow beside the point as far as number 
theory is concerned. We shall begin with Cantor’s proof. 
Cantor proved the existence of transcendental numbers simply by showing
that there are, in a sense, more complex numbers than algebraic numbers. 
Specifically, the set of complex numbers is uncountable — this follows immedi- 
ately from the uncountability of the reals, proved by Cantor in 1874 — while, 
as we shall now show, the set of (complex) algebraic numbers is countable. 


First, a slightly informal proof. Recall that an algebraic number is, (almost)
by definition, a root of a non-zero polynomial with integral coefficients. Define
the height of any such polynomial to be the maximum of the absolute values
of its coefficients: that is, if $f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ with all
$a_k$ integers and $a_n \ne 0$, then
$$H(f) = \max(|a_n|, |a_{n-1}|, \ldots, |a_1|, |a_0|).$$
The height of any polynomial f ≠ 0 is a positive integer. Let m be a positive
integer and consider all polynomials f with deg f + H(f) = m. Any such
polynomial has degree less than m and therefore at most m non-zero coefficients;
each of these coefficients is an integer from −m to m. Therefore, the number
of f satisfying deg f + H(f) = m is at most $(2m+1)^m$ and is hence finite.
So we can construct a list of all non-zero polynomials over Z by writing down 
those with degree plus height equal to 1, followed by those with degree plus 
height equal to 2, and so on. We can now write down all the roots of the first 
polynomial in our list, followed by all the roots of the second, and so forth; 
deleting any repetitions in this list, we have listed all (real and complex) alge- 
braic numbers once each. Since the set of algebraic numbers can be arranged 
in a list it is countable, and therefore cannot include all complex numbers. 
Thus transcendental numbers exist. 
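
The listing of polynomials by degree plus height can be made concrete. Here is a minimal sketch of the enumeration; the function name is illustrative and the ordering within each level is an arbitrary choice.

```python
from itertools import product


def polynomials_with_index(m):
    """All non-zero integer polynomials f with deg f + H(f) == m, as coefficient
    tuples (a_n, ..., a_0) with a_n != 0."""
    found = []
    for degree in range(m):                # H(f) >= 1 forces deg f <= m - 1
        height = m - degree
        for coeffs in product(range(-height, height + 1), repeat=degree + 1):
            if coeffs[0] != 0 and max(abs(c) for c in coeffs) == height:
                found.append(coeffs)
    return found


for m in (1, 2, 3):
    level = polynomials_with_index(m)
    print(m, len(level), (2 * m + 1) ** m)   # the count stays below the bound (2m+1)^m
```

Concatenating the levels m = 1, 2, 3, ... lists every non-zero polynomial over Z exactly once, and replacing each polynomial by its finitely many roots gives the list of algebraic numbers described above.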


Theorem 3.13. Transcendental numbers exist. 


Proof. We give a more formal version of the proof outlined in the previous
paragraph. See appendix 1 at the end of this chapter for basic results on
countability. Let Z[z] be the set of polynomials in z with integral coefficients,
and let S be the set of finite sequences of integers. Since Z is countable, S is
also countable. It is easy to see that, taking $a_n \ne 0$ as usual,
$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 \;\mapsto\; (a_n, a_{n-1}, \ldots, a_1, a_0)$$
defines a one-to-one function from Z[z] to S, and therefore Z[z] is countable.
For each non-zero polynomial f in Z[z] let $S_f$ be the (finite) set of roots of f. Then
$$\{\,\text{algebraic numbers}\,\} = \bigcup_{f \in Z[z],\ f \ne 0} S_f$$
is countable by a result from the appendix. Since C is not countable,
transcendental numbers exist.


Comment. According to the intuitionist school in the philosophy of mathematics
(originating with L.E.J. Brouwer, 1881-1966), an existence proof is
not valid unless it explicitly provides an algorithm for the construction of the
object whose existence is asserted. In particular, the above proof would not
be accepted by those who adopt this stance. The same objection is made to
the ε-δ style of proof in elementary analysis; intuitionists have to reformulate
definitions and then either re-prove or abandon results concerning limiting
processes. A great deal of work has been done on this task; for an introduction
see Körner [36], Chapters VI and VII, or for basic calculus from an
intuitionist point of view try Bishop [15].


3.3 APPROXIMATION OF REAL NUMBERS BY RATIONALS 


Instead of relying on Cantor's countability argument we can go back to
Liouville's earlier proofs. These make it possible to explicitly construct
transcendental numbers and are therefore, from the number-theoretic point of
view, more interesting than Cantor's proof. They also avoid the intuitionist
objections mentioned above (though perhaps not completely so, as they do
make use of the Mean Value Theorem from elementary calculus).

[Figure: Joseph Liouville (1809-1882)]

Liouville's methods derive from an investigation of
the problem of approximating real numbers by rationals. Let α ∈ R;
we wish to ask how closely α can be approximated by rational numbers p/q.
That is, we want to know how small
$$\left|\alpha - \frac{p}{q}\right| \tag{3.4}$$
can be made by a suitable choice of the rational p/q. Unfortunately, this
problem is too easy to be of any interest: as the rationals are dense in R, the
difference (3.4), for any α, can be made as small as desired by choosing a large
value of q and an appropriate p. Specifically, if we want the difference to be
smaller than a positive number ε, we choose q > 1/(2ε) and let p be the closest
integer to qα. Then
$$|q\alpha - p| \le \frac12 \quad\Rightarrow\quad \left|\alpha - \frac{p}{q}\right| \le \frac{1}{2q} < \varepsilon. \tag{3.5}$$
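
The recipe leading to (3.5) translates directly into a few lines of code. A minimal sketch using exact rational arithmetic; the function name and the sample value of α (a rational stand-in chosen here for illustration) are assumptions.

```python
from fractions import Fraction
import math


def approximate(alpha, eps):
    """Return integers (p, q) with |alpha - p/q| < eps, choosing q > 1/(2 eps)
    and p as the closest integer to q * alpha."""
    q = math.floor(1 / (2 * eps)) + 1      # smallest integer exceeding 1/(2 eps)
    p = round(q * alpha)                   # closest integer to q * alpha
    return p, q


alpha = Fraction(314159265, 10**8)         # illustrative stand-in for a real number
eps = Fraction(1, 1000)
p, q = approximate(alpha, eps)

print(p, q)
print(abs(alpha - Fraction(p, q)) < eps)   # True
```

Note that nothing about α is used beyond the ability to round qα, which is why this level of approximation says nothing interesting about the arithmetic nature of α.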


This observation, though not very interesting in itself, may suggest a more 
significant approach, namely, to insist that the closeness of approximation 
should depend on the denominator of the approximating fraction. In other 
words, we shall be interested in a fairly weak approximation if it is given by a 
fraction with very small denominator, whereas if the denominator is large we 
shall expect the approximation to be exceptionally close. One way to achieve
this is to try to solve an inequality such as
$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2},$$
where α is a given real number and we seek rational p/q. In this case, if we are
forced to choose a large value of q, we do at least know that the approximation
is much closer than we had previously with (3.5).


To obtain worthwhile results we need to note two more points. A single 
solution of either of the above inequalities is of little importance as it does 
not give rational numbers arbitrarily close to a: what we really want is that 
there be infinitely many solutions to such an inequality. Secondly, we would 
like the right-hand side of the inequality to be uniquely determined by the 
approximating fraction p/q; therefore we shall require that gq be the “true” 
denominator of the fraction, that is, that p and q have no common factor. As 
a result of these considerations, we introduce the following terminology. 


Definition 3.4. A real number α is said to be approximable (by rationals)
to order s if there exists a constant c such that the inequality
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^s} \tag{3.6}$$
is satisfied by infinitely many rational numbers p/q with p, q relatively prime.


Note. 


• In this definition s may be any positive real number. In practice s will
frequently, though not always, be an integer.

• The approximations to α given by (3.6) are very close when s is large,
less so when s is small. We shall say that α is well approximable by
rationals if it is approximable to a high order, and poorly approximable
if it is approximable only to a low order.

• It is clear that if α is approximable to order s then it is approximable
to any order less than s. So the basic problem in this topic is to find the
greatest possible order of approximation for a given real number.


Example. Consider the number 


a= ys 10-2" = 0.1101000100000001000- - - j 
k=0 
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which we proved irrational in Chapter 1. We observe from the decimal that 
the rational approximation obtained by taking, say, the first eight decimal 
places is actually accurate to fifteen decimal places, and therefore is much 
better than we might expect. If for any m we choose integers 


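$$q = 10^{2^m} \quad\text{and}\quad p = q \sum_{k=0}^{m} 10^{-2^k} ,$$

then $p, q$ are coprime (look at the last term in the sum) and

$$\left|\alpha - \frac{p}{q}\right| = \frac{1}{10^{2^{m+1}}} + \frac{1}{10^{2^{m+2}}} + \frac{1}{10^{2^{m+3}}} + \cdots < \frac{2}{q^2} .$$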


Since $m$ is arbitrary, we can find infinitely many rationals $p/q$ with this
property, and so $\alpha$ is approximable to order 2.
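
The two claims in this example (coprimality of $p, q$ and the bound $2/q^2$) can be checked with exact integer arithmetic. The following Python sketch is only an illustration: the function names are not from the text, and the tail is bounded using the observation (an assumption of the sketch, though easily verified) that each omitted term is at most one tenth of its predecessor.

```python
from fractions import Fraction
from math import gcd

def truncation(m):
    """Return (p, q) with q = 10**(2**m) and p/q the sum of 10**(-2**k) for k = 0..m."""
    q = 10 ** (2 ** m)
    p = sum(q // 10 ** (2 ** k) for k in range(m + 1))
    return p, q

def tail_bound(m):
    """Upper bound for alpha - p/q: the first omitted term times 10/9.

    Each omitted term is at most a tenth of its predecessor, so a geometric
    series with ratio 1/10 dominates the tail.
    """
    first_omitted = Fraction(1, 10 ** (2 ** (m + 1)))
    return first_omitted * Fraction(10, 9)

for m in range(5):
    p, q = truncation(m)
    assert gcd(p, q) == 1                      # p/q is in lowest terms
    assert tail_bound(m) < Fraction(2, q * q)  # hence |alpha - p/q| < 2/q^2
```

For instance, `truncation(3)` is the eight-decimal-place approximation mentioned above, with $q = 10^8$.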


There are various slightly different ways to formulate the statement that 
a number is, or is not, approximable to a given order. A lemma will be useful. 


Lemma 3.14. Given a real number $\alpha$, let $c$ be a real constant, $s$ and $Q$
positive real numbers. Then the inequalities

$$\left|\alpha - \frac{p}{q}\right| \le \frac{c}{q^s} \quad\text{and}\quad 0 < q < Q$$

are satisfied simultaneously by only a finite number of pairs $(p,q)$ of integers.


Proof. Let $(p,q)$ satisfy the inequalities. Clearly there are only finitely many
possible values for $q$, and since

$$q\alpha - \frac{c}{q^{s-1}} \le p \le q\alpha + \frac{c}{q^{s-1}} ,$$

each of these yields only finitely many $p$. This proves the lemma.


Theorem 3.15. Alternative definitions. Let $\alpha$ be a real number and $s$ a
positive real number. Then the following are equivalent:

• $\alpha$ is approximable to order $s$;

• there exists a constant $c$ such that the inequality

$$0 < \left|\alpha - \frac{p}{q}\right| \le \frac{c}{q^s} \tag{3.7}$$

is satisfied by pairs $(p,q)$ with arbitrarily large $q$;

• there exists a constant $c$ such that (3.7) is satisfied by infinitely many
pairs of integers $(p,q)$.
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Proof. Suppose that a is approximable to order s, so that infinitely many pairs 
$(p,q)$ of coprime integers satisfy (3.6). By Lemma 3.14, only finitely many of
these satisfy $q < Q$, where $Q$ is any given bound. So (3.6) has solutions $(p,q)$
with arbitrarily large $q$. The equality $|\alpha - p/q| = 0$ holds for at most one of these
pairs (indeed, for none of them if $\alpha$ is irrational), and so pairs with arbitrarily
large $q$ satisfy (3.7). This shows that the first statement implies the second.


It is clear that the second statement implies the third; now we show that
the third implies the first. If $(mp, mq)$ is a solution of (3.7) for some positive
integer $m$, then so is $(p,q)$, for we have

$$0 < \left|\alpha - \frac{p}{q}\right| = \left|\alpha - \frac{mp}{mq}\right| \le \frac{c}{(mq)^s} \le \frac{c}{q^s} .$$

Therefore, if (3.7) has infinitely many solutions but (3.6) has only finitely
many, then there would be a pair $(p,q)$ such that $(mp, mq)$ satisfies (3.7) for
infinitely many $m$. Then we should have

$$0 < \left|\alpha - \frac{p}{q}\right| = \left|\alpha - \frac{mp}{mq}\right| \le \frac{c}{(mq)^s} = \frac{c}{m^s q^s}$$

for arbitrarily large $m$; but this is impossible, as $|\alpha - p/q|$ is a fixed positive real
number, while $c/m^s q^s$ tends to zero as $m \to \infty$. This shows that the third
statement implies the first, and completes the proof.


Comment. We need to include the left-hand inequality in (3.7) since we
are now speaking not of rational numbers but of pairs of integers, and have
dropped the requirement that $p, q$ be coprime. This means that if $\alpha = a/b$ is
rational, the inequality

$$\left|\alpha - \frac{p}{q}\right| \le \frac{c}{q^s}$$

would be trivially satisfied by the infinitely many pairs $(p,q) = (ma, mb)$.


Theorem 3.16. A non-approximability criterion. Let $\alpha$ be a real number and
$t$ a positive real number. Suppose that there exists a positive constant $c$ such
that

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q^t}$$

for all rational numbers $p/q \ne \alpha$. Then $\alpha$ is not approximable to any order
greater than $t$.


Proof. If the given assumption holds and $\alpha$ is approximable to order $s > t$
then, using the second part of Theorem 3.15, there is another constant $c'$ such
that the inequalities

$$\frac{c}{q^t} \le \left|\alpha - \frac{p}{q}\right| \le \frac{c'}{q^s}$$

are satisfied by certain rationals $p/q$ with arbitrarily large $q$. But the
inequalities yield $q^{s-t} \le c'/c$; as the exponent $s - t$ is positive, we obtain a
contradiction by choosing $q$ sufficiently large.
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We have now seen enough generalities concerning approximability to begin 
looking at some specific cases. First, the reasoning which led to the inequal- 
ity (3.5) gives us a result valid for all real numbers. 


Theorem 3.17. Any real number is approximable to order 1, at least.

Proof. Take, say, $c = 1$ and for any $q$ let $p$ be the closest integer to $q\alpha$. Then

$$\left|\alpha - \frac{p}{q}\right| = \frac{|q\alpha - p|}{q} \le \frac{1}{2q} < \frac{c}{q} .$$

If $\alpha$ is irrational, then the left-hand side is non-zero for all $q$; if $\alpha$ is rational,
it is non-zero for all $q$ except multiples of the denominator of $\alpha$; in either
case (3.7), with $s = 1$, is true for infinitely many pairs $(p,q)$, and so $\alpha$ is
approximable to order 1.


Next we consider the case $\alpha \in \mathbb{Q}$. Of course such an $\alpha$ can be
“approximated” precisely by one rational number, namely, $\alpha$ itself. However, our
definition of approximability demands that there be infinitely many distinct
rationals close to $\alpha$. In this sense rational numbers are only very weakly
approximable.


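Lemma 3.18. Let $\alpha$ be rational. Then there exists a constant $c > 0$ such that
for every rational $p/q \ne \alpha$ we have

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q} .$$


Proof. Let $\alpha = a/b$ with $b > 0$. If $p/q \ne a/b$, then $aq - pb$ is a non-zero integer and so

$$\left|\alpha - \frac{p}{q}\right| = \frac{|aq - pb|}{bq} \ge \frac{1}{bq} .$$

Taking $c = 1/b$, the lemma is proved.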
Comments. 


• We say that “there exists a constant $c$” in order to get used to the style
of future proofs. Although in the present case, we have given an explicit
formula for $c$, to do so in more advanced results is inconvenient at best,
sometimes difficult, sometimes even impossible.


• We can use this lemma to prove the irrationality of $e$. For suppose that
$e \in \mathbb{Q}$, and for any $n \ge 1$ choose

$$q = n! , \qquad p = n!\left(1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}\right) .$$

Then there is a positive constant $c$ such that for all $n$, we have

$$\frac{c}{q} \le \left|e - \frac{p}{q}\right| = \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots < \frac{1}{(n+1)!}\cdot\frac{n+1}{n} = \frac{1}{nq} ;$$

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but this cannot be true when n > 1/c. Essentially, this is the same as 
the proof we gave in Chapter 1. 


Theorem 3.19. A rational number is not approximable to any order s > 1. 


Proof. This is an immediate consequence of the previous lemma and the
non-approximability criterion, Theorem 3.16. However, we shall repeat the
argument for the sake of clarity. Suppose, then, that the rational number $\alpha$ is
approximable to order $s > 1$. By using the above lemma and one of our equivalent
definitions of approximability, we have

$$\frac{c}{q} \le \left|\alpha - \frac{p}{q}\right| \le \frac{c'}{q^s}$$

for some positive constants $c, c'$ and for certain $p/q$ with arbitrarily large $q$.
But the inequality implies $q^{s-1} \le c'/c$ with $s - 1 > 0$, and for large $q$ this is
impossible.


Example. The irrationality of

$$\alpha = \sum_{k=0}^{\infty} 10^{-2^k} = 0.1101000100000001000\cdots$$

follows immediately, since we showed in the example following Definition 3.4
that $\alpha$ is approximable to order 2. Observe that although the decimal expansion
of $\alpha$ provides the motivation for the irrationality proof, the actual details are
entirely independent of the decimal. Therefore, this proof is quite different from
the one we gave in Chapter 1.


Theorem 3.19 is entirely characteristic of rational numbers; irrational num- 
bers, on the other hand, are approximable to order 2 or more. The proof of 
the next lemma is a beautiful and surprising application of the pigeonhole 
principle. 


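Lemma 3.20. Let $\alpha \in \mathbb{R}$ and let $Q$ be a positive integer. Then there is a
rational number $p/q$ in lowest terms, with denominator at most $Q$, such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{qQ} .$$


Proof. Divide the interval $[0,1)$ into $Q$ subintervals

$$\left[0, \frac{1}{Q}\right),\ \left[\frac{1}{Q}, \frac{2}{Q}\right),\ \ldots,\ \left[\frac{Q-1}{Q}, 1\right)$$

and consider the fractional parts of the $Q+1$ numbers $0, \alpha, 2\alpha, \ldots, Q\alpha$. These
fractional parts all lie in the interval $[0,1)$, and so two of them must lie in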
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the same one of the Q subintervals mentioned above. Therefore, there exist 
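integers $q_1$ and $q_2$ with $0 \le q_1 < q_2 \le Q$ and

$$\bigl|\,(q_2\alpha - \lfloor q_2\alpha \rfloor) - (q_1\alpha - \lfloor q_1\alpha \rfloor)\,\bigr| < \frac{1}{Q} .$$

Taking $q = q_2 - q_1$ and $p = \lfloor q_2\alpha \rfloor - \lfloor q_1\alpha \rfloor$, we have $0 < q \le Q$ and

$$|q\alpha - p| < \frac{1}{Q} .$$

If we divide out any common factor of $p, q$ the inequality remains true, and
the lemma follows.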
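
The pigeonhole argument in this proof is effectively an algorithm, and it may help to see it run. The Python sketch below is one possible rendering, not the author's: the function name is illustrative, and $\alpha$ is passed as a `Fraction` purely so that every comparison is exact (the lemma itself holds for any real $\alpha$, rational or not).

```python
from fractions import Fraction
from math import floor, gcd

def dirichlet_approximation(alpha, Q):
    """Pigeonhole search for (p, q) in lowest terms with 0 < q <= Q and |q*alpha - p| < 1/Q.

    alpha is a Fraction so that all arithmetic below is exact.
    """
    seen = {}  # subinterval index -> multiplier k whose fractional part lies there
    for k in range(Q + 1):
        whole = floor(k * alpha)
        box = floor((k * alpha - whole) * Q)  # index of [box/Q, (box+1)/Q)
        if box in seen:
            q1 = seen[box]
            q = k - q1
            p = whole - floor(q1 * alpha)
            g = gcd(p, q)                     # divide out any common factor
            return p // g, q // g
        seen[box] = k
    raise AssertionError("unreachable: Q + 1 values cannot avoid a shared subinterval")

p, q = dirichlet_approximation(Fraction(355, 113), 10)
print(p, q)  # prints: 22 7
```

With $Q = 10$ and $\alpha = 355/113$ the fractional parts of $\alpha$ and $8\alpha$ fall in the same subinterval, giving $q = 7$ and the familiar approximation $22/7$.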


Theorem 3.21. Approximability of irrational numbers. Any irrational real 
number is approximable to order 2, at least. 


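Proof. Let $\alpha$ be irrational. Take any $Q = Q_1 > 0$ and choose a rational
number $p_1/q_1$ satisfying the lemma. Then

$$\left|\alpha - \frac{p_1}{q_1}\right| < \frac{1}{Q_1 q_1} \le \frac{1}{q_1^2}$$

and the left-hand side is not zero since $\alpha \notin \mathbb{Q}$. Now take

$$Q = Q_2 > \frac{1}{|\alpha - p_1/q_1|}$$

and apply the lemma again to obtain a rational number $p_2/q_2$ with

$$0 < \left|\alpha - \frac{p_2}{q_2}\right| < \frac{1}{q_2^2} ;$$

moreover,

$$\left|\alpha - \frac{p_2}{q_2}\right| < \frac{1}{Q_2 q_2} \le \frac{1}{Q_2} < \left|\alpha - \frac{p_1}{q_1}\right|$$

and so $p_2/q_2$ is not equal to $p_1/q_1$. Continuing in this way we find infinitely
many solutions of the inequality (3.7) with $s = 2$.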


Just as rational numbers are distinguished from irrationals in being ap- 
proximable to no greater order than 1, so, as it turns out, algebraic numbers 
are distinguished from (some) transcendentals in being approximable to no 
greater order than 2. The following result was proved by K.F. Roth in 1955, 
and earned him the 1958 Fields medal. 


Theorem 3.22. Roth’s Theorem [56]. A real algebraic number is not
approximable to any order greater than 2.
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Roth’s Theorem was the culmination of a series of results on approximability 
properties of algebraic numbers, of which the most important are the following. 
If $\alpha$ is an algebraic number of degree $n$, then


• Liouville [39] proved in 1851 that $\alpha$ is not approximable to order greater
than $n$;


• in 1908-9 Thue [63], [64] improved Liouville’s result by showing that $\alpha$
is not approximable to order greater than $\frac{1}{2}n + 1$;


• Siegel [59] showed in 1921 that $\alpha$ is not approximable to order greater
than $2\sqrt{n}$.


Since we have already shown that every irrational number is approximable to
order 2, Roth’s Theorem is the best possible result of its type. We shall prove
Liouville’s result, which suffices to establish the transcendence of certain
interesting real numbers, but shall pass over with brief comments the results of
Thue, Siegel and Roth.


Theorem 3.23. (Liouville, 1851). If $\alpha$ is a real algebraic number of degree
$n$, then $\alpha$ is not approximable to any order exceeding $n$.


Proof. If $\alpha$ is rational, then $\alpha$ has degree 1, and we have already proved
the result (Theorem 3.19). So let $\alpha$ be an irrational algebraic number with
minimal polynomial $f$ and degree $n \ge 2$. For any real $x \ne \alpha$, the Mean Value
Theorem states that

$$\frac{f(x) - f(\alpha)}{x - \alpha} = f'(y)$$

for some $y$ between $x$ and $\alpha$. Taking $x = p/q$, rearranging and recalling that
$f(\alpha) = 0$ by definition, we have

$$f\!\left(\frac{p}{q}\right) = -f'(y)\left(\alpha - \frac{p}{q}\right) . \tag{3.8}$$

We wish to obtain a lower estimate for $q^n|\alpha - p/q|$, so that we can establish
Liouville’s result by appealing to the non-approximability criterion,
Theorem 3.16. Note, however, that the constant $c$ in that theorem is independent
of $p/q$; here $y$ does depend on $p/q$ and therefore must not appear in our
estimate. First, suppose that $|\alpha - p/q| < 1$. Then $\alpha - 1 < y < \alpha + 1$. Since
$f'$ is continuous on $\mathbb{R}$ it is bounded on any finite interval and we have, say,
$|f'(y)| \le \mu$. From the equality (3.8) we obtain

$$\left|f\!\left(\frac{p}{q}\right)\right| \le \mu \left|\alpha - \frac{p}{q}\right| .$$

Now let $b$ be a common denominator for the coefficients of $f$. Then $bf(x)$ is
a polynomial with integer coefficients and so $bq^n f(p/q)$ is an integer, which,
moreover, is not zero since $f$ is irreducible and has no rational roots. Therefore

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{1}{\mu}\left|f\!\left(\frac{p}{q}\right)\right| = \frac{|bq^n f(p/q)|}{b\mu q^n} \ge \frac{1}{b\mu q^n}$$

provided that $|\alpha - p/q| < 1$. On the other hand, if $|\alpha - p/q| \ge 1$ then obviously

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{1}{q^n} .$$

So take $c = \min(1, 1/b\mu)$. Then since $c$ does not depend on $p/q$, we have for
every rational number $p/q$

$$\left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q^n} , \tag{3.9}$$

and by the non-approximability criterion, this shows that $\alpha$ is not
approximable to order greater than $n$.
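
To make the constant $c = \min(1, 1/b\mu)$ concrete, consider $\alpha = \sqrt{2}$ with minimal polynomial $f(x) = x^2 - 2$, so that $n = 2$, $b = 1$ and $f'(x) = 2x$. The sketch below is an illustration under stated assumptions, not part of the text: it takes $\mu = 5$, a convenient upper bound for $|f'|$ on $[\alpha - 1, \alpha + 1]$ (any upper bound serves in the proof), and uses floating-point arithmetic, which is adequate for the small denominators tested.

```python
from math import sqrt

ALPHA = sqrt(2)          # root of f(x) = x**2 - 2, degree n = 2, common denominator b = 1
MU = 5.0                 # assumed upper bound for |f'(x)| = |2x| on [ALPHA - 1, ALPHA + 1]
C = min(1.0, 1.0 / MU)   # the constant c = min(1, 1/(b*mu)) from the proof

def worst_ratio(max_q):
    """Smallest value of q**2 * |alpha - p/q| for 1 <= q <= max_q, with p nearest to q*alpha."""
    return min(q * q * abs(ALPHA - round(q * ALPHA) / q) for q in range(1, max_q + 1))

ratio = worst_ratio(2000)
assert ratio >= C        # inequality (3.9) with n = 2 holds for every q tested
print(ratio)
```

Only the nearest integer $p$ needs testing for each $q$, since any other numerator gives a larger error. The observed minimum is comfortably above $c = 1/5$, which reflects the fact that the constant produced by the proof is far from optimal.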




At last we can give a specific example of a transcendental number. 


Theorem 3.24. Liouville’s number. The number

$$\lambda = \sum_{k=1}^{\infty} 10^{-k!}$$

is transcendental.


Proof. Suppose, on the contrary, that $\lambda$ is algebraic of degree $n$. For any
integer $m > n$ define

$$q = 10^{m!} \quad\text{and}\quad p = q \sum_{k=1}^{m} 10^{-k!} .$$

Then $p$ and $q$ are coprime integers and we have

$$\left|\lambda - \frac{p}{q}\right| = \frac{1}{10^{(m+1)!}} + \frac{1}{10^{(m+2)!}} + \cdots < \frac{2}{10^{(m+1)!}} = \frac{2}{q^{m+1}} < \frac{2}{q^{n+1}} .$$

Since this holds for all $m > n$, there are infinitely many rational $p/q$ in lowest
terms satisfying

$$\left|\lambda - \frac{p}{q}\right| \le \frac{c}{q^s}$$

with $c = 2$ and $s = n + 1$, and so $\lambda$ is approximable to order $n + 1$. But this
contradicts Liouville’s Theorem, and it follows that $\lambda$ is transcendental.
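
The estimate at the heart of this proof can be confirmed exactly for small $m$. The Python sketch below is illustrative only (the function names are not from the text); it bounds the tail of the series by the first omitted term times $10/9$, on the assumption, easily checked, that each omitted term is at most a tenth of the one before.

```python
from fractions import Fraction
from math import factorial, gcd

def liouville_truncation(m):
    """Return (p, q) with q = 10**(m!) and p/q the sum of 10**(-k!) for k = 1..m."""
    q = 10 ** factorial(m)
    p = sum(q // 10 ** factorial(k) for k in range(1, m + 1))
    return p, q

def liouville_tail_bound(m):
    """Upper bound for lambda - p/q: first omitted term times 10/9 (geometric domination)."""
    return Fraction(10, 9) / 10 ** factorial(m + 1)

for m in range(1, 5):
    p, q = liouville_truncation(m)
    assert gcd(p, q) == 1
    # 10**((m+1)!) equals q**(m+1), so the tail is below 2 / q**(m+1)
    assert liouville_tail_bound(m) < Fraction(2, q ** (m + 1))
```

The key point visible in the code is that the exponent $m + 1$ grows with $m$, whereas for the earlier example $\sum 10^{-2^k}$ the corresponding exponent stays fixed at 2.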


Definition 3.5. A real number which is approximable to arbitrarily high order 
is called a Liouville number. 
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It follows immediately from Theorem 3.23 that every Liouville number is 
transcendental. The converse, however, is not true. Consider once again 


$$\alpha = \sum_{k=0}^{\infty} 10^{-2^k} = 0.1101000100000001000\cdots .$$


By truncating the decimal after each 1 (the same idea, in fact, as we used
for $\lambda$) we showed that $\alpha$ is approximable to order 2. On the other hand, it
is difficult to think of any way of obtaining better rational approximations
than these; this suggests that $\alpha$ is not approximable to any order greater
than 2, and so is not a Liouville number. However, $\alpha$ is transcendental, as we
shall show in Chapter 6. As an example of a transcendental number which
is (demonstrably!) not a Liouville number, we need only consider $\pi$. Kurt
Mahler [42] proved in 1953 that

$$\left|\pi - \frac{p}{q}\right| > \frac{1}{q^{42}}$$

for all rational numbers $p/q$ with $q \ge 2$; this result has been improved by
various authors, V.Kh. Salikhov [57, 58] showing in 2008 that there is a constant
$q_0$ such that

$$\left|\pi - \frac{p}{q}\right| > \frac{1}{q^{7.61}}$$

for all $p/q$ with $q \ge q_0$. Therefore, $\pi$ is not approximable to order greater than
7.61. Either of these results shows that $\pi$ is not a Liouville number. However,
$\pi$ is transcendental: a proof of this will be given in Chapter 5 by extending
Hermite’s methods from Chapter 2.



In 2020 Zeilberger and Zudilin improved upon some of the details in Sa- 
likhov’s argument, thereby marginally reducing the exponent 7.61 to 7.11. At 
the time of writing, this is the record in what the authors [71] refer to as a 
“competitive sport”. 

The exponential constant e is another well-known number which is tran- 
scendental (see Chapter 5) but not Liouville (exercise 4.20). Some very dif- 
ferent examples were obtained by Mahler [41] in 1937: he showed that the 
members of a class of real numbers represented by decimal expansions such 
as 


$$0.12345678910111213141516\cdots \quad\text{and}\quad 0.149162536496481100121144\cdots$$


(the former is the Champernowne constant mentioned in Chapter 1) are
transcendental but are not Liouville numbers. In Chapter 4 we shall use
continued fractions to give a complete proof of the existence of non-Liouville
transcendental numbers.


Note the importance, in Liouville’s proof, of ensuring that the constant $c$ in
the inequality (3.9) is independent of $p/q$. This is necessary since the inequality
in Theorem 3.16 must hold for a fixed $c$ and all rationals $p/q$ except $\alpha$. If $c$ is
allowed to depend on $p/q$, the inequality can be satisfied trivially, regardless
of the approximability properties of $\alpha$. In Liouville’s proof, $y$ lies between $\alpha$
and $p/q$. Therefore, although no simple functional relationship has been given,
$y$ does depend on $p/q$, and so $c$ must not depend directly on $y$. To overcome
this problem we replace $|f'(y)|$ by $\mu$, the maximum absolute value of $f'$ over
the interval $[\alpha - 1, \alpha + 1]$, which is independent of $p/q$.


On the other hand, it is permissible for $c$ to depend on $\alpha$ and on $s$.
Considering Liouville’s proof again, the algebraic number $\alpha$ completely determines
the polynomial $f$ and hence the common denominator $b$; the derivative $f'$
therefore is also uniquely determined by $\alpha$, and so is $\mu$. Hence $c = \min(1, 1/b\mu)$
defines $c$ as a positive real number which depends on $\alpha$ but not on any
particular rational $p/q$.


Similar considerations apply if we wish to show from the definition that a
real number $\alpha$ is approximable to order $s$: the constant $c$ in the inequality (3.6)
appearing in the definition must be the same for all solutions $p/q$. This is
obviously so in, for example, the proof that every irrational is approximable
to order 2, where we just took $c = 1$.


Liouville’s Theorem can be generalised in various ways. First, we rephrase 
the result (3.9) that we obtained on page 50, near the end of Liouville’s proof. 


Theorem 3.25. Let $\alpha$ be an algebraic number of degree $n$. Then there is a
constant $c$, depending only on $\alpha$, such that for all integers $p$ and all positive
integers $q$ we have

$$|q\alpha - p| \ge \frac{c}{q^{n-1}} ,$$

provided that $q\alpha - p \ne 0$.


This can be regarded as a result about the size of $g(\alpha)$, where $g(z) = qz - p$ is
a linear polynomial with integer coefficients. We may generalise the result by
removing the restriction that $g$ be linear. As on page 41, let the height of any
polynomial $g \in \mathbb{Z}[z]$ be the maximum of the absolute values of the coefficients
of $g$. That is, if $g(z) = a_m z^m + a_{m-1} z^{m-1} + \cdots + a_1 z + a_0$ then

$$H(g) = \max_{0 \le k \le m} |a_k| .$$


We have the following result. 


Theorem 3.26. Let $\alpha$ be an algebraic number of degree $n$. Then there is a
positive constant $c$, depending only on $\alpha$, such that for all polynomials $g$ of
degree $m$, with integer coefficients, we have

$$|g(\alpha)| \ge \frac{c^m}{H(g)^{n-1}} ,$$

provided that $g(\alpha) \ne 0$.




Proof. Let the conjugates of $\alpha$ be $\alpha_1, \alpha_2, \ldots, \alpha_n$ with $\alpha = \alpha_1$, and let $a$ be
the maximum of the absolute values of these conjugates,

$$a = \max_{1 \le k \le n} |\alpha_k| .$$

Writing $g(z) = g_m z^m + g_{m-1} z^{m-1} + \cdots + g_1 z + g_0$, from the definition of $H(g)$ we have

$$\begin{aligned}
|g(\alpha_k)| &\le |g_m||\alpha_k|^m + |g_{m-1}||\alpha_k|^{m-1} + \cdots + |g_1||\alpha_k| + |g_0| \\
&\le H(g)\,(a^m + a^{m-1} + \cdots + a + 1) \\
&\le H(g)\,(a+1)^m
\end{aligned}$$

for each $k$. Let $d$ be a common denominator for $\alpha_1, \alpha_2, \ldots, \alpha_n$, that is, a
positive rational integer such that $d\alpha_k$ is an algebraic integer for every $k$.
Then

$$d^m g(\alpha_k) = g_m (d\alpha_k)^m + d\,g_{m-1}(d\alpha_k)^{m-1} + \cdots + d^{m-1} g_1 (d\alpha_k) + d^m g_0$$

is a non-zero algebraic integer for every $k$. It follows that

$$N = \prod_{k=1}^{n} d^m g(\alpha_k) \tag{3.10}$$

is a non-zero algebraic integer. By using properties of symmetric polynomials
(see Chapter 5) it is possible to show that the product on the right-hand
side of (3.10) is rational; consequently $N$ is a rational integer and has absolute
value at least 1. Splitting the first term off from the product and rearranging
the powers of $d$, we have

$$d^{mn}\,|g(\alpha)| \prod_{k=2}^{n} |g(\alpha_k)| \ge 1 .$$

Now for $k = 2, 3, \ldots, n$, the estimate for $|g(\alpha_k)|$ obtained above yields

$$d^{mn}\,|g(\alpha)|\,H(g)^{n-1}(a+1)^{m(n-1)} \ge 1 ;$$

choose

$$c = \frac{1}{d^n (a+1)^{n-1}} .$$

Then $c$ depends on $n$, $a$ and $d$, all of which are uniquely determined by $\alpha$.
Therefore, $c$ depends on $\alpha$ alone, and since

$$|g(\alpha)| \ge \frac{c^m}{H(g)^{n-1}} ,$$

the result is proved.


Another way to generalise Liouville’s Theorem is to regard it as provid- 
ing information about the approximation of a fixed algebraic number a by 
algebraic numbers of degree 1. The following result, which is proved in Lip- 
man [40], concerns the approximation of a given algebraic number by other 
algebraic numbers. 
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Theorem 3.27. Let a be an algebraic number of degree n. Then there is a 
positive constant c, depending only on a, such that for all algebraic numbers 


$\xi$ of degree $m$ we have

$$|\alpha - \xi| \ge \frac{c^m}{H(\xi)^n} ,$$

provided that $\xi \ne \alpha$. Here the algebraic number $\xi$ satisfies a unique polynomial
$g$ of degree $m$ having relatively prime integer coefficients, and the height $H(\xi)$
is defined to be the height of $g$, in the sense given on page 41.


We complete this section with a brief description of the methods used
by Thue and Roth to improve Liouville’s Theorem. Thue’s result that a real
algebraic number of degree $n$ is not approximable to order exceeding $\frac{1}{2}n + 1$
is proved by contradiction. Assume that $\alpha$ has degree $n$ and is approximable
to order $s > \frac{1}{2}n + 1$. That is, there is a constant $c$ such that the inequality

$$0 < \left|\alpha - \frac{p}{q}\right| \le \frac{c}{q^s} \tag{3.11}$$

has infinitely many solutions. Choose a solution $p_1/q_1$ with $q_1$ very large. Let
$m$ be a large integer, and choose another solution $p_2/q_2$ of (3.11), taking $q_2$ so
large by comparison with $q_1$ that $q_2 \ge q_1^m$. Thue then shows that it is possible
to find a polynomial g(x, y) in two variables, with integral coefficients, having 
the following properties: (i) $g(\alpha, \alpha) = 0$; (ii) the partial derivatives $\partial^k g/\partial x^k$,
evaluated at $(\alpha, \alpha)$, are zero for many values of $k$; (iii) the coefficients of $g$
are “not too large”; (iv) $g(p_1/q_1, p_2/q_2)$ is not zero. In fact, if $g_1$ and $g_2$ are
polynomials then

$$g(x,y) = (y - \alpha)\,g_1(x,y) + (x - \alpha)^m g_2(x,y)$$

clearly satisfies condition (i), and because of the factor $(x - \alpha)^m$, condition
(ii) is a restriction only on $g_1$, and not on $g_2$. Thue proved the existence of $g_1$
so that (ii) is satisfied, and of $g_2$ so that (iii) and (iv) are then also satisfied;
this is the difficult bit. Once $g$ has been found, the proof proceeds in a manner
similar to that of Liouville.


Siegel showed that an algebraic number is not approximable to order
greater than $2n^{1/2}$. Assuming that inequality (3.11) has infinitely many
solutions for some $s > 2n^{1/2}$, Siegel chose two solutions with large denominators
(as Thue had done), and obtained a contradiction by refining certain details
in Thue’s argument. The belief arose that for $s > c_k n^{1/k}$, one should be able
to choose $k$ solutions of (3.11) and then use polynomials in $k$ variables to
obtain a contradiction. If this could be done for arbitrarily large $k$, the greatest
possible order of approximation $s$ would simply be a constant, irrespective of
the degree of $\alpha$. The construction of the required polynomial turned out to be
very difficult but was eventually accomplished in 1955 by K.F. Roth.


Theorem 3.22. Roth’s Theorem [56]. A real algebraic number is not
approximable to any order greater than 2.


3.4 IRRATIONALITY OF $\zeta(3)$: A SKETCH


We conclude this chapter with a very brief summary of Apéry’s notoriously
complex irrationality proof for $\zeta(3)$. Our only aim is to show how the argument
is based fundamentally on the approximation ideas introduced in the previous
section: specifically, Apéry showed that $\zeta(3)$ is approximable to order (just
slightly) greater than 1, and therefore cannot be rational. The reader should
not be deluded into believing that the arguments we have omitted are easy!
They most assuredly are not. More details (as well as an engaging account of
the circumstances surrounding Apéry’s announcement of his result) may be
found in [66]. Another, and possibly simpler, irrationality proof for $\zeta(3)$ was
given by Beukers [14]. Although superficially Beukers’ approach appears quite
different from Apéry’s, the author acknowledges a close connection between
the two.


So, we begin by recalling the definition

$$\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = 1 + \frac{1}{2^3} + \frac{1}{3^3} + \frac{1}{4^3} + \cdots .$$

By intricate but essentially straightforward algebra we may obtain an
alternative summation formula

$$\zeta(3) = \frac{5}{2} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}} = \frac{5}{2}\left(\frac{1}{1^3 \times 2} - \frac{1}{2^3 \times 6} + \frac{1}{3^3 \times 20} - \frac{1}{4^3 \times 70} + \cdots\right) .$$

The heart of Apéry’s argument consists of defining two “mysterious” sequences
$a_n$ and $b_n$ with the property that the quotients $a_n/b_n$ form a sequence of very
good rational approximations to $\zeta(3)$. Set

$$a_0 = 0 , \quad a_1 = 6 , \quad a_n = \frac{34n^3 - 51n^2 + 27n - 5}{n^3}\,a_{n-1} - \frac{(n-1)^3}{n^3}\,a_{n-2}$$

for $n \ge 2$, and

$$b_0 = 1 , \quad b_1 = 5 , \quad b_n = \frac{34n^3 - 51n^2 + 27n - 5}{n^3}\,b_{n-1} - \frac{(n-1)^3}{n^3}\,b_{n-2}$$

for $n \ge 2$. One observes that $a_n$ and $b_n$ satisfy the same recurrence, and differ
only in their respective initial conditions. Amazingly, despite the fractional
coefficients in its recurrence, it can be shown that $b_n$ is always an integer!
This is not the case for $a_n$; however, it turns out that $a_n$ is a rational number
whose denominator is a factor of $2L_n^3$, where $L_n$ is the least common multiple
of the integers $1, 2, \ldots, n$. Apéry also proved that

$$\lim_{n \to \infty} \frac{a_n}{b_n} = \zeta(3) .$$
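
The integrality claims are easy to test numerically for small $n$, even though proving them is hard. The following Python sketch is an illustration rather than part of Apéry's argument: it iterates the recurrence with exact rational arithmetic, and the helper names `apery_sequences` and `lcm_up_to` are introduced here for convenience only.

```python
from fractions import Fraction
from math import gcd

def apery_sequences(n_max):
    """Return lists a, b of Fractions obeying
    n^3 u_n = (34n^3 - 51n^2 + 27n - 5) u_{n-1} - (n-1)^3 u_{n-2},
    with a_0 = 0, a_1 = 6 and b_0 = 1, b_1 = 5."""
    a = [Fraction(0), Fraction(6)]
    b = [Fraction(1), Fraction(5)]
    for n in range(2, n_max + 1):
        poly = 34 * n**3 - 51 * n**2 + 27 * n - 5
        for u in (a, b):
            u.append((poly * u[n - 1] - (n - 1) ** 3 * u[n - 2]) / n**3)
    return a, b

def lcm_up_to(n):
    """L_n, the least common multiple of 1, 2, ..., n (with L_0 = 1)."""
    result = 1
    for k in range(1, n + 1):
        result = result * k // gcd(result, k)
    return result

a, b = apery_sequences(10)
for n in range(11):
    assert b[n].denominator == 1                            # b_n is an integer
    assert (2 * lcm_up_to(n) ** 3 * a[n]).denominator == 1  # 2 L_n^3 a_n is an integer
print(float(a[10] / b[10]))  # close to zeta(3) = 1.2020569...
```

The first few values of $b_n$ produced this way are 1, 5, 73, 1445, and already $a_2/b_2 = 351/292$ agrees with $\zeta(3)$ to four decimal places.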
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The above results about a, and b, are the most difficult aspects of Apéry’s 
argument. It is then relatively straightforward to prove that

$$\zeta(3) - \frac{a_n}{b_n} = \sum_{k=n+1}^{\infty} \frac{6}{k^3 b_k b_{k-1}} ,$$

from which it follows that

$$0 < \zeta(3) - \frac{a_n}{b_n} < \frac{c_1}{b_n^2}$$

for a certain constant $c_1$. This does not show that $\zeta(3)$ is approximable to
order 2, because $a_n$ is not an integer. But if we define

$$p_n = 2L_n^3 a_n , \qquad q_n = 2L_n^3 b_n ,$$

then $p_n$ and $q_n$ are integers and we have

$$0 < \left|\zeta(3) - \frac{p_n}{q_n}\right| < \frac{c_1}{b_n^2} .$$

Our aim is to find a constant $s > 1$ such that

$$\frac{1}{b_n^2} \le \frac{1}{q_n^s} \tag{3.12}$$

for infinitely many $n$. To achieve this we need to know something about the
size of $b_n$, and also of $L_n$, since that appears in the definition of $q_n$. It can be
shown that if

$$\lambda = 17 + 12\sqrt{2}$$

(not actually as mysterious as it seems!) then

$$\frac{b_n}{\lambda^n / n^{3/2}}$$

approaches a finite non-zero limit as $n \to \infty$; hence there exist constants
$c_2, c_3 > 0$ such that

$$\frac{c_2 \lambda^n}{n^{3/2}} < b_n < \frac{c_3 \lambda^n}{n^{3/2}} .$$

The Prime Number Theorem, which gives an estimate for the number of
primes not exceeding $n$, can be employed to show that

$$L_n < 3^n$$

for all sufficiently large $n$. (See appendix 3.3 for some details and further
references.) A little easy algebra then shows that the inequality (3.12) is true
for all sufficiently large $n$ satisfying

$$\left(\frac{\lambda^2}{(27\lambda)^s}\right)^{n} \ge \frac{2^s c_3^s}{c_2^2}\, n^{3 - 3s/2} .$$

Now the left-hand side is an exponential, the right-hand side is a power
function; so the left-hand side is certainly the greater (for sufficiently large
$n$), provided that it is an increasing exponential, that is, $\lambda^2 > (27\lambda)^s$. This
holds for some $s > 1$ if and only if $\lambda > 27$; referring back to the value of $\lambda$
stated above, it is very easy to see that this is true, and the proof is essentially
complete. Specifically, any $s$ greater than 1 and less than

$$\frac{2\log\lambda}{\log(27\lambda)} = 1.033667\cdots$$

will do what we want; we have shown, therefore, that

$$0 < \left|\zeta(3) - \frac{p}{q}\right| < \frac{c_1}{q^s}$$

has infinitely many solutions, so $\zeta(3)$ is approximable to order 1.03 and cannot
be rational.
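
The numerical value of the critical exponent is quickly reproduced. The short Python sketch below is illustrative only (the function name is an assumption of the sketch); it also checks the identity $17 + 12\sqrt{2} = (1 + \sqrt{2})^4$, which is one way of seeing that $\lambda$ is less mysterious than it looks.

```python
from math import log, sqrt

def apery_exponent():
    """Return 2*log(lam)/log(27*lam), the supremum of admissible orders s, for lam = 17 + 12*sqrt(2)."""
    lam = 17 + 12 * sqrt(2)
    assert lam > 27                               # needed for lam**2 > (27*lam)**s with some s > 1
    assert abs(lam - (1 + sqrt(2)) ** 4) < 1e-9   # lam is the fourth power of 1 + sqrt(2)
    return 2 * log(lam) / log(27 * lam)

print(apery_exponent())  # approximately 1.0337
```

The margin $\lambda > 27$ is what makes the whole argument work: had the least common multiple $L_n$ grown appreciably faster than $3^n$, the exponent would have dropped to 1 or below and no conclusion about irrationality could be drawn from these approximations.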


EXERCISES 


3.1 Find the minimal polynomial of a = af 2/3. 


3.2 Find the minimal polynomial of a = cos aT. What are the conjugates 
of cos 47 ? 


3.3 If ¢ = e?7/9 is a fifth root of unity, find the minimal polynomials of 
a=C+¢4 and B=¢+¢?. 


3.4 Prove that $a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ is irreducible if and only
if $a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n$ is irreducible.

3.5 Let $n > 2$ be an integer.
(a) Show that $\tan(\pi/n)$ is algebraic.
(b) Find the minimal polynomial of $\tan(\pi/p)$ if $p$ is prime and $p > 2$.
(c) Show that if $n$ is composite, then the degree of $\tan(\pi/n)$ is strictly
less than $n - 1$. Find the minimal polynomial of $\tan(\pi/10)$.


3.6 Suppose that $\alpha$ is a root of a polynomial $a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$
having integral coefficients. Prove that the denominator of $\alpha$ is a factor
of $a_n$.


3.7 Let $3\theta$ be the smallest angle of a 3-4-5 triangle. Show that $\alpha = \cos\theta$ is
an algebraic number; find its degree, its denominator and its conjugates.


3.8 Show that if $a, b$ are positive rationals, $a \ne b$ and $a^{1/n} - b^{1/n}$ is rational,
then $a^{1/n}$ and $b^{1/n}$ are both rational.

This was set as a “challenge problem” in exercise 1.4: congratulations
to any reader who solved it! The problem should be somewhat more
approachable using results we have seen in Chapter 3.


3.9 Prove that $1, \sqrt[3]{2}, \sqrt[3]{4}$ are linearly independent over $\mathbb{Q}$. This problem
appeared as exercise 1.6: it should be quite easy using methods of the
present chapter.


3.10 (a) Show that two conjugate algebraic numbers can never differ by a
rational number. (Except for zero of course!)

(b) Let $\alpha$ be algebraic and not zero. Prove that if $r$ is rational and $r\alpha$
is a conjugate of $\alpha$, then $r = \pm 1$; furthermore, if $\alpha$ has odd degree,
then $r = 1$ only.


3.11 Let the polynomial $f(z)$ be irreducible over $\mathbb{Q}$. Show that $f(z)$ has no
repeated factors over $\mathbb{C}$.

Comment. The terminology in use is that over $\mathbb{Q}$, every irreducible
polynomial is separable. This result does actually depend on specific
properties of $\mathbb{Q}$, and there exist cases in which an analogous result is
not true. See [61], Chapter 8.


3.12 Show that for any integer $n \ge 2$ and any prime number $p$, the polynomial
$f(z) = z^n + z^{n-1} + p$ is irreducible.


3.13 In this exercise we investigate the irreducibility of $f(z) = z^4 + 1$.

(a) Suppose that $p$ is a prime and that there exists an integer $a$ such
that $a^2 \equiv 2 \pmod{p}$. Factorise $f_p(z)$ as a product of two quadratics
over $\mathbb{Z}_p$.

(b) Suppose that $p$ is prime and there exists $a$ with $a^2 \equiv -2 \pmod{p}$.
Factorise $f_p(z)$ over $\mathbb{Z}_p$.

(c) It can be shown (see, for example, Hardy and Wright [29], section
6.7) that if $p$ is prime and neither of the above conditions holds,
then there exists $a$ such that $a^2 \equiv -1 \pmod{p}$. Use this fact to
complete the proof that if $p$ is any prime, then $f_p$ is reducible over
$\mathbb{Z}_p$.

(d) Find a composite $m$ such that $f_m$ is irreducible over $\mathbb{Z}_m$, and
deduce that $f$ is irreducible over $\mathbb{Z}$.

(e) Alternatively, use Eisenstein’s criterion to show that $f$ is
irreducible over $\mathbb{Z}$.


3.14 (a) Prove the following extension of Eisenstein’s Lemma. Let

$$f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$

be a polynomial with integral coefficients and let $m$ be an integer,
$1 \le m \le n$. Suppose there exists a prime $p$ with $p \mid a_0, a_1, \ldots, a_{m-1}$
and $p \nmid a_m$ and $p^2 \nmid a_0$. Then $f$ is not the product of two rational
polynomials of degree less than $m$.

(b) Deduce that $f(z) = z^4 + z^3 + 2z^2 + 6z + 2$ is irreducible.


3.15 Prove Westlund’s criteria for irreducibility [68]. Let $a_1, a_2, \ldots, a_n$ be
distinct integers. Then

• $f(z) = (z - a_1)(z - a_2) \cdots (z - a_n) - 1$ is irreducible;

• $g(z) = (z - a_1)(z - a_2) \cdots (z - a_n) + 1$ is irreducible unless it is a
perfect square.


3.16 Prove that if at least one of the numbers $\alpha$ and $\beta$ is transcendental, then
at least one of the numbers $\alpha + \beta$ and $\alpha\beta$ is transcendental.


3.17 Prove that if $\alpha$ is not approximable to order $s$, then there exists a positive
real constant $c$ such that the inequalities

$$0 < \left|\alpha - \frac{p}{q}\right| \le \frac{c}{q^s}$$

have no rational solution $p/q$.


3.18 Show that if $\alpha$ is approximable to order $s$, then $\alpha^2$ is approximable to
order $s/2$.


3.19 Show that if $a$ and $s$ are positive integers with $a \ge 2$ then

$$\alpha = \sum_{k=1}^{\infty} \frac{1}{a^{s^k}}$$

is approximable to order $s$. Deduce from Roth’s Theorem that if $s \ge 3$
then $\alpha$ is transcendental.

3.20 A simultaneous approximation problem. Prove that there is a positive
integer $k \le 100$ such that $k\sqrt{2}$ and $k\sqrt{3}$ are both integers to within 0.1.

Let a be a natural number with a > 2, and let {by }?2, be a sequence 
of integers satisfying 

0<b < bg <b3<-:-. 

Show that if b,41/b, is unbounded then 

co 

1 
Oe Doe 

k=1 
is a Liouville number (and hence is transcendental). 
Prove that any real number is the sum of two Liouville numbers. 
Ruler-and-compass constructions. Two of the great unsolved problems
of geometry in Ancient Greece were known as “duplicating the cube”
and “trisecting the angle”. In each case the constructions were to be
accomplished without the use of any instruments other than a pair of
compasses and an unmarked ruler. In the nineteenth century these
problems were approached from an algebraic point of view, and it was shown
(see, for example, [61]) that, under the stated restrictions, line segments
with lengths in the ratio $\alpha$ can be constructed only if $\alpha$ is an
algebraic number whose degree is a power of 2.


(a) The problem of duplicating the cube is: given a cube, construct
a cube of twice its volume. Use the result mentioned above to
show that this construction is impossible (by means of ruler and
compass).


(b) The problem of trisecting the angle is: given any angle, divide it
into three equal angles. Explain why constructing an angle of size $\theta$
is equivalent to constructing line segments in the ratio $\cos\theta$. Then
suppose that $\cos\theta$ is rational, and show that $\theta$ can be trisected by
ruler and compasses only if the polynomial $f(z) = 4z^3 - 3z - \cos\theta$
is reducible. Finally, find an example of an angle which cannot be
trisected with ruler and compasses.


3.24 Find a set of points $\{P_1, P_2, \ldots\}$ in the real plane, as few as possible,
such that every point in the plane is an irrational distance from at least
one $P_k$. (After [23].)


3.25 Let $a$, $b$ and $c$ be positive integers, and let $\alpha$ be the real cube root of $a/b$.
By considering rational approximations to $\alpha$, prove that the equation

$$ax^3 - by^3 = c$$

cannot have infinitely many solutions in integers $x$, $y$.


3.26 Let $p$ be prime, $p > 3$. Show that if a $p$-gon has all its angles equal and
all its sides of integer length, then it is regular.


3.27 Recall that the Champernowne constant is

$$\xi = 0.12345678910111213\cdots.$$

The following working shows how to obtain one very good rational
approximation to $\xi$. Use similar ideas to find infinitely many very good
rational approximations. What can you say about the transcendence,
or, if algebraic, the degree of $\xi$ by appealing to (a) Liouville’s Theorem;
(b) Siegel’s Theorem; (c) Roth’s Theorem?


We shall take advantage of the decimal expansion of $\xi$ and related
numbers to find a very simple way of multiplying by 99. First, we write

$$10^9\xi = 123456789.101112\cdots99100101\cdots = p_1 + \xi_1 + r_1,$$

where

$$p_1 = 123456789, \qquad \xi_1 = 0.\underbrace{1011\cdots99}_{90\ \text{pairs}}$$

and

$$r_1 = 0.\underbrace{00\cdots00}_{180\ \text{digits}}100101\cdots < 10^{-180}.$$

Next we have

$$\begin{aligned}
99\xi_1 &= 100\xi_1 - \xi_1 \\
&= 10.\underbrace{1112\cdots9899}_{89\ \text{pairs}} - 0.\underbrace{1011\cdots9798}_{89\ \text{pairs}} - 0.\underbrace{00\cdots00}_{89\ \text{pairs}}99 \\
&= p_2 + \xi_2 - r_2,
\end{aligned}$$

where $p_2 = 10$ and

$$\xi_2 = 0.\underbrace{1112\cdots9899}_{89\ \text{pairs}} - 0.\underbrace{1011\cdots9798}_{89\ \text{pairs}} = 0.\underbrace{0101\cdots0101}_{89\ \text{pairs}}$$

and $r_2 < 10^{-178}$. Using the same idea again,

$$99\xi_2 = 100\xi_2 - \xi_2 = 1.\underbrace{0101\cdots01}_{88\ \text{pairs}} - 0.\underbrace{0101\cdots01}_{89\ \text{pairs}} = 1 - r_3$$

with $r_3 = 10^{-178}$. Putting everything back together yields

$$99^2 \times 10^9\xi - 99^2 p_1 - 99 p_2 - 1 = 99^2 r_1 - 99 r_2 - r_3;$$

hence

$$|q\xi - p| < 99^2 r_1 + 99 r_2 + r_3 < \frac{3}{10^{176}},$$

where $q = 99^2 \times 10^9$ and $p$ is an integer that we do not need to calculate.
It is easy to see that $q < 10^{13}$, which gives

$$\left|\xi - \frac{p}{q}\right| < \frac{3}{q \cdot 10^{176}} < \frac{3}{q^{1+176/13}} < \frac{3}{q^{14.53}}.$$


3.28 Prove that
1 1 1 = 1 
a= (1+or)(1+ ga) (1 +96) = + aa) 
k=1 


is a Liouville number. (You may assume that the infinite product con- 
verges.) At some stage the inequality 


a 1 2 
II (1 + san) Ste 2(m+1)! 
k=m-+1 


may be useful (but ideally you should prove this rather than just taking 
it as given). 
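The single approximation to the Champernowne constant worked out in the exercise above can be checked with exact rational arithmetic. The following is only a sketch: the helper name `champernowne_prefix` and the choice to truncate the constant after the block "200" are assumptions made here, not part of the text. The truncation error is far below $10^{-176}$, so it does not affect the comparison.

```python
from fractions import Fraction

def champernowne_prefix(last: int) -> Fraction:
    """Exact rational 0.123...<last>, a truncation of the Champernowne constant."""
    digits = "".join(str(n) for n in range(1, last + 1))
    return Fraction(int(digits), 10 ** len(digits))

xi = champernowne_prefix(200)        # 492 decimal digits (truncation is an assumption)
q = 99**2 * 10**9                    # the denominator used in the working
p = round(q * xi)                    # nearest integer to q * xi
error = abs(q * xi - p)

# p agrees with 99^2 * p_1 + 99 * p_2 + 1, where p_1 = 123456789 and p_2 = 10.
print(p == 99**2 * 123456789 + 99 * 10 + 1)
# The bound |q*xi - p| < 3 / 10^176 from the working.
print(error < Fraction(3, 10**176))
```

Using `Fraction` keeps every step exact, which matters here because the quantity being bounded is about 170 orders of magnitude smaller than double-precision rounding error.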
APPENDIX 1: COUNTABLE AND UNCOUNTABLE SETS


Definition 3.6. A set S is said to be countable if there exists a one-to-one 
function from S to N, the set of natural numbers, and uncountable if it is 
not countable. 


Lemma 3.28. Any finite set is countable; the set of integers is countable. 


Lemma 3.29. Let f be a function from A to B and g a function from B to
C. If f and g are both one-to-one, so is the composite function $g \circ f$. If $g \circ f$
is one-to-one, then so is f.


Corollary 3.30. If T is countable and there is a one-to-one function from 
S to T, then S is countable. 

Exercise. Give an example to show that if $g \circ f$ is one-to-one, g need not be
one-to-one.

Theorem 3.31. If S and T are countable, then so are $S \times T$ and $S \cup T$. If
S is countable and $R \subseteq S$, then R is countable.


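Proof. Let $f : S \to \mathbb{N}$ and $g : T \to \mathbb{N}$ be one-to-one functions. Then

$$h : S \times T \to \mathbb{N}, \qquad h(s,t) = 2^{f(s)}\,3^{g(t)}$$

and

$$k : S \cup T \to \mathbb{N} \times \mathbb{N}, \qquad k(x) = \begin{cases} (0, f(x)) & \text{if } x \in S \\ (1, g(x)) & \text{if } x \notin S \end{cases}$$

are one-to-one. If $R \subseteq S$, then the restriction $f|_R$ of f to R is one-to-one.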
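As a small illustration of why a coding of pairs by prime powers is one-to-one, the sketch below assumes the map has the form $h(s,t) = 2^{f(s)} 3^{g(t)}$ and checks on a finite grid that no two pairs of exponents receive the same code. The function name `h` and the grid size are choices made here for illustration only.

```python
from itertools import product

def h(fs: int, gt: int) -> int:
    """Code the pair (f(s), g(t)) of natural numbers as 2^f(s) * 3^g(t)."""
    return 2**fs * 3**gt

# Every pair of exponents in a 50 x 50 grid gets its own code, as unique
# factorisation into primes guarantees.
codes = [h(i, j) for i, j in product(range(50), repeat=2)]
print(len(codes) == len(set(codes)))
```

A finite check is of course not a proof; the general statement follows from uniqueness of prime factorisation.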
The above constructions can easily be generalised to prove the following. 
Theorem 3.32. Suppose that the sets $S_0, S_1, S_2, \ldots$ are countable. Then


• $S_0 \times S_1 \times \cdots \times S_n$ is countable for each natural number n;


• $S_0 \cup S_1 \cup S_2 \cup \cdots$ is countable.


Exercise. Explain why we need to restrict the Cartesian product in this result 
to finitely many terms, but do not need to restrict the union in the same way. 


The second of the above properties can also be extended a little further. 


Theorem 3.33. Let T be countable and suppose that for every $t \in T$, the set
$S_t$ is countable. Then

$$S = \bigcup_{t \in T} S_t$$

is a countable set.

Proof. Let $g : T \to \mathbb{N}$, and $f_t : S_t \to \mathbb{N}$ for each $t \in T$, be one-to-one
functions. For each $x \in S$ let $t = t(x)$ be the element of T such that $g(t)$ is
minimal subject to the condition $x \in S_t$. Then

$$h(x) = \bigl(g(t(x)),\, f_{t(x)}(x)\bigr)$$

defines a one-to-one function h from S to $\mathbb{N} \times \mathbb{N}$, and so S is countable.


Theorem 3.34. If S is countable, then the collection of all finite sequences 
of elements of S is a countable set. 


Proof. If we write $\varepsilon$ for the empty sequence, then the collection we are
considering is

$$\{\varepsilon\} \cup S \cup S^2 \cup S^3 \cup \cdots,$$


and the theorem follows from earlier results. 


Theorem 3.35. (Cantor, 1874). The set of real numbers is uncountable. 


APPENDIX 2: THE MEAN VALUE THEOREM 


Theorem 3.36. The Mean Value Theorem of differential calculus. Let f be
a real-valued function defined and continuous on the interval $[a,b] \subset \mathbb{R}$, and
differentiable on the interior of this interval. Then there is a real number
$c \in (a,b)$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$


APPENDIX 3: THE PRIME NUMBER THEOREM 


For any real number $x$, we denote by $\pi(x)$ the number of primes less than or
equal to $x$. For example, $\pi(1) = 0$ and $\pi(10) = 4$ and $\pi(100) = \pi(100.5) = 25$.
The symbol $\pi$ is customary and is used because it is the Greek equivalent of
“p”, the first letter of the word “prime”; it has, of course, nothing to do with
the trigonometric constant $\pi$. The following result, which gives an estimate for
$\pi(x)$ in terms of elementary functions, was proved independently and more or
less simultaneously in 1896 by Hadamard [27] and de la Vallée Poussin [65].
A proof is given by Hardy and Wright [29].


Theorem 3.37. The Prime Number Theorem. The number of primes not
exceeding $x$ satisfies

$$\frac{\pi(x)}{x/\log x} \to 1 \quad \text{as } x \to \infty,$$

where log denotes the natural (base e) logarithm.

We may use the Prime Number Theorem to estimate the least common
multiple of the first n positive integers. Given a prime p and a positive integer
n, there is a unique non-negative integer a such that $p^a \le n < p^{a+1}$; the
least common multiple will be the product of all these powers $p^a$. Therefore

$$L_n = \operatorname{lcm}(1,2,\ldots,n) = \prod_{\substack{p \le n \\ p\ \text{prime}}} p^a \le \prod_{\substack{p \le n \\ p\ \text{prime}}} n = n^{\pi(n)},$$

the last equality being true because the product consists of $\pi(n)$ equal factors.
Let c be a constant greater than e. It follows from the Prime Number Theorem
that if n is sufficiently large then

$$\frac{\pi(n)}{n/\log n} < \log c$$

and hence

$$L_n \le n^{\pi(n)} = e^{\pi(n)\log n} < c^n.$$


In section 3.4 we chose c = 3 to keep things simple. We could have taken a 
slightly smaller value, which would have resulted in a slightly larger value for 
s; however, this would have made no significant difference to the result. 


Comment. In fact, it can be shown [55] that the maximum value of $(L_n)^{1/n}$
occurs when $n = 113$. Therefore, if we take

$$c = (\operatorname{lcm}(1,2,\ldots,113))^{1/113} = 2.8258821394\cdots,$$

then we have $L_n \le c^n$ for all n, and not just for all sufficiently large n.
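The claim about $n = 113$ can be explored numerically. The sketch below computes $(L_n)^{1/n}$ for $n$ up to a cutoff and reports where the largest value occurs. The cutoff of 1000 and the function name `lcm_roots` are assumptions made here; a finite search cannot replace the proof cited as [55]. Logarithms are used because $L_n$ quickly exceeds the range of floating-point numbers.

```python
import math

def lcm_roots(limit: int) -> list[float]:
    """Return [(L_n)^(1/n) for n = 1..limit], where L_n = lcm(1, 2, ..., n)."""
    roots, running = [], 1
    for n in range(1, limit + 1):
        running = math.lcm(running, n)            # exact integer L_n
        roots.append(math.exp(math.log(running) / n))
    return roots

roots = lcm_roots(1000)
best_n = max(range(1, len(roots) + 1), key=lambda n: roots[n - 1])
print(best_n, roots[best_n - 1])
```

`math.lcm` requires Python 3.9 or later.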


CHAPTER 4 


Continued Fractions 


Come back tomorrow night 
...we’re gonna do fractions! 


Tom Lehrer 


IN CHAPTER 3 we were looking for rational approximations to various real
numbers as a means of proving irrationality or transcendence. There is in
fact a standard procedure for obtaining all of the “best” rational
approximations (in a sense to be explained later) to a given real number by the
use of continued fractions.

We shall start with the Euclidean algorithm for computing the greatest 
common divisor of two (positive) integers. For example, beginning with 95 
and 37, we have 


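$$95 = 2 \times 37 + 21, \quad 37 = 1 \times 21 + 16, \quad 21 = 1 \times 16 + 5, \quad 16 = 3 \times 5 + 1,$$

which shows that the greatest common divisor of 95 and 37 is 1. Rewriting
these equalities in an obvious way gives

$$\frac{95}{37} = 2 + \frac{21}{37}, \quad \frac{37}{21} = 1 + \frac{16}{21}, \quad \frac{21}{16} = 1 + \frac{5}{16}, \quad \frac{16}{5} = 3 + \frac{1}{5},$$

and if we combine all these expressions we obtain

$$\frac{95}{37} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{5}}}}.$$

This is called the continued fraction expansion of $\frac{95}{37}$. To save space we
normally use one of two alternative notations:

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}} = [a_0, a_1, a_2, \ldots, a_n] = a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\cdots\frac{1}{a_n}.$$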
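The passage from the Euclidean algorithm to the continued fraction of 95/37 can be sketched in a few lines: the successive quotients of the algorithm are exactly the partial quotients. The function name `continued_fraction` is a choice made here for illustration.

```python
def continued_fraction(p: int, q: int) -> list[int]:
    """Partial quotients of p/q (with q > 0), read off from the Euclidean algorithm."""
    quotients = []
    while q != 0:
        a, r = divmod(p, q)   # p = a*q + r with 0 <= r < q
        quotients.append(a)
        p, q = q, r           # continue with the pair (divisor, remainder)
    return quotients

print(continued_fraction(95, 37))  # [2, 1, 1, 3, 5]
```

The loop stops when the remainder reaches zero, which mirrors the fact that the expansion of a rational number is finite.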
It is easy to see that if $a_0, a_1, a_2, \ldots, a_n$ are positive integers, then the
expressions just considered represent rational numbers. If we try to find a
continued fraction for an irrational number such as $\sqrt2$, we might begin as
follows. Since $1 < \sqrt2 < 2$ we have

$$\sqrt2 = 1 + (\sqrt2 - 1),$$

and in order to express the “remainder” $\sqrt2 - 1$ as a fraction with numerator
1 we write

$$\sqrt2 - 1 = \frac{1}{1/(\sqrt2 - 1)} = \frac{1}{\sqrt2 + 1}.$$

Since $2 < \sqrt2 + 1 < 3$ we continue

$$\sqrt2 = 1 + \frac{1}{2 + (\sqrt2 - 1)} = 1 + \cfrac{1}{2 + \cfrac{1}{\sqrt2 + 1}},$$

and we observe that with the reappearance of the denominator $\sqrt2 + 1$ the
process will repeat. Suppressing any qualms about infinite algebraic expressions
we might write

$$\sqrt2 = 1 + \frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{2+}\cdots,$$

though we should realise that this raises a convergence question which needs
to be resolved.


It’s time to introduce some terminology. 


4.1 DEFINITION AND BASIC PROPERTIES 


Definition 4.1. A finite or infinite expression of the form

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}}$$

is called a continued fraction. A simple continued fraction is one in which
every $b_k$ is 1, every $a_k$ is an integer, and every $a_k$ except possibly $a_0$ is
positive. For a (finite or infinite) simple continued fraction we shall also use the
notations

$$a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\,\frac{1}{a_3+}\cdots \quad \text{and} \quad [a_0, a_1, a_2, a_3, \ldots].$$

A finite simple continued fraction is said to represent the number obtained
by performing the arithmetic in the obvious way; an infinite simple continued
fraction $[a_0, a_1, a_2, a_3, \ldots]$ represents the real number $\alpha$ if

$$\alpha = \lim_{n \to \infty} [a_0, a_1, a_2, a_3, \ldots, a_n].$$

Let $k \in \mathbb{N}$. The integer $a_k$ is called the kth partial quotient of the continued
fraction $[a_0, a_1, a_2, \ldots]$, or of the number $\alpha$ it represents; the continued
fraction $\alpha_k = [a_k, a_{k+1}, a_{k+2}, \ldots]$ is the kth complete quotient of $\alpha$; and the
continued fraction $[a_0, a_1, \ldots, a_k]$ is the kth convergent to $\alpha$.


Note that a convergent, being defined as a finite continued fraction, is
always a rational number. Henceforth we shall blur the distinction between
a continued fraction and the number represented by the continued
fraction. We shall use such language as, for example, “the continued fraction
$\alpha = [a_0, a_1, a_2, \ldots]$” instead of saying more precisely, “the continued fraction
$[a_0, a_1, a_2, \ldots]$ which represents the number $\alpha$”.


Continued fractions, their convergents, partial quotients and complete quo- 
tients have many fascinating properties. 


Lemma 4.1. Basic properties of continued fractions. Let $\alpha$ be the finite
simple continued fraction $[a_0, a_1, \ldots, a_n]$ or the infinite continued fraction
$[a_0, a_1, a_2, \ldots]$. Let the kth complete quotient of $\alpha$ be $\alpha_k$. For integers $k \ge -2$
define $p_k$ and $q_k$ inductively by

$$p_{-2} = 0, \quad p_{-1} = 1, \quad p_k = a_k p_{k-1} + p_{k-2} \ \text{ for } k \ge 0;$$
$$q_{-2} = 1, \quad q_{-1} = 0, \quad q_k = a_k q_{k-1} + q_{k-2} \ \text{ for } k \ge 0.$$

Here and in the following we shall assume in the finite case that $k \le n$. In
any case a list $a_0, a_1, \ldots, a_{k-1}$ with $k = 0$ is taken to be the empty list. We
have


• $\alpha = [a_0, a_1, \ldots, a_{k-1}, \alpha_k]$ for $k \ge 0$;


• if $x > 0$ is real and $k \ge 0$, then

$$[a_0, a_1, \ldots, a_{k-1}, x] = \frac{x p_{k-1} + p_{k-2}}{x q_{k-1} + q_{k-2}};$$


• if $k \ge 0$ then

$$\alpha = \frac{\alpha_k p_{k-1} + p_{k-2}}{\alpha_k q_{k-1} + q_{k-2}};$$

• $p_k/q_k = [a_0, a_1, \ldots, a_k]$ is the kth convergent to $\alpha$, for $k \ge 0$;

• if $k \ge -1$, then

$$p_{k-1} q_k - p_k q_{k-1} = (-1)^k;$$

• $p_k$ and $q_k$ are relatively prime for any $k \ge 0$.


Proof. The first result is clear from the definition. The second is an easy
induction proof making use of the identity

$$\frac{1}{a_1+}\,\frac{1}{a_2+}\cdots\frac{1}{a_k+}\,\frac{1}{x} = \frac{1}{a_1+}\,\frac{1}{a_2+}\cdots\frac{1}{a_{k-1}+}\,\frac{1}{a_k + \frac{1}{x}},$$

and the third follows immediately. Taking $x = a_k$ in the second result yields
the fourth, and the fifth is again an induction using

$$\begin{aligned}
p_k q_{k+1} - p_{k+1} q_k &= p_k(a_{k+1} q_k + q_{k-1}) - (a_{k+1} p_k + p_{k-1}) q_k \\
&= -(p_{k-1} q_k - p_k q_{k-1}).
\end{aligned}$$

It follows from the fifth property that any common divisor of $p_k$ and $q_k$ is also
a divisor of $(-1)^k$, and this proves the last result.


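These properties provide a convenient method for evaluating a finite simple
continued fraction $\alpha$ and all its convergents: list the partial quotients $a_k$ and
calculate $p_k$ and $q_k$ recursively. Then all the values of $p_k/q_k$ are convergents
to $\alpha$, and the last is $\alpha$ itself. For example, to find the convergents of

$$\alpha = [1, 3, 1, 5, 1, 1, 4]$$

we construct a table

    k   | -2   -1    0    1    2    3    4    5    6
    a_k |            1    3    1    5    1    1    4
    p_k |  0    1    1    4    5   29   34   63  286
    q_k |  1    0    1    3    4   23   27   50  227

which gives the convergents

$$\frac{1}{1},\ \frac{4}{3},\ \frac{5}{4},\ \frac{29}{23},\ \frac{34}{27},\ \frac{63}{50},\ \frac{286}{227},$$

the last of which is the value of $\alpha$.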
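The tabular method translates directly into code. The sketch below carries the two most recent values of $p_k$ and $q_k$ along the list of partial quotients, starting from $p_{-2} = 0$, $p_{-1} = 1$, $q_{-2} = 1$, $q_{-1} = 0$ as in Lemma 4.1. The function name `convergents` is a choice made here.

```python
def convergents(quotients: list[int]) -> list[tuple[int, int]]:
    """Return [(p_0, q_0), (p_1, q_1), ...] for the partial quotients a_0, a_1, ..."""
    p_prev, p_curr = 0, 1   # p_{-2}, p_{-1}
    q_prev, q_curr = 1, 0   # q_{-2}, q_{-1}
    table = []
    for a in quotients:
        # p_k = a_k p_{k-1} + p_{k-2}, and likewise for q_k
        p_prev, p_curr = p_curr, a * p_curr + p_prev
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        table.append((p_curr, q_curr))
    return table

for p, q in convergents([1, 3, 1, 5, 1, 1, 4]):
    print(f"{p}/{q}")
```

The last pair printed is 286/227, the value of the whole continued fraction.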


We have seen that any finite simple continued fraction represents a rational 
number. The converse is also true. 


Theorem 4.2. Any rational number is represented by a finite simple contin- 
ued fraction. 


Proof. Let $\alpha = p/q$ with, as usual, p, q relatively prime and $q > 0$. We
shall prove the result by induction on q. First, if $q = 1$ then the continued
fraction is just $\alpha = [p]$. Let $q > 1$ and assume that any rational number with
denominator less than q can be expressed as a finite continued fraction. Divide
p by q to give quotient a and remainder r: that is,

$$p = aq + r \quad \text{with} \quad 0 \le r < q.$$

The remainder r is not zero since p and q are coprime; so by the inductive
hypothesis we can write q/r as a finite simple continued fraction $[b_0, b_1, \ldots, b_m]$,
and we have

$$\alpha = \frac{p}{q} = a + \frac{1}{q/r} = [a, b_0, b_1, \ldots, b_m].$$

This proves the inductive step, and the result follows.


Although any rational number can be written as a continued fraction, the
representation is not unique. For example, we have

$$\frac{95}{37} = 2 + \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{3+}\,\frac{1}{5} = 2 + \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{3+}\,\frac{1}{4+}\,\frac{1}{1}.$$

We may, if convenient, rule out the second possibility by insisting that the last
partial quotient in a continued fraction be greater than 1; and there are no
further continued fraction representations of a rational number, as is shown
in the following result.


Proposition 4.3. “Almost-uniqueness” of continued fractions. Suppose that
$[a_0, a_1, \ldots, a_m]$ and $[b_0, b_1, \ldots, b_n]$ are finite simple continued fractions
representing the same (rational) number, where by symmetry we may assume that
$0 \le m \le n$. Then either


• $m = n$ and $a_0 = b_0$, $a_1 = b_1$, ..., $a_m = b_m$; or
• $m = n - 1$ and $a_0 = b_0$, $a_1 = b_1$, ..., $a_{m-1} = b_{m-1}$, with $a_m = b_m + 1$
and $b_n = 1$.

Proof. First, consider the case $m = 0$; then $a_0 = [b_0, b_1, \ldots, b_n]$. If $n = 0$,
then $a_0 = b_0$. If $n = 1$, then $a_0 = b_0 + 1/b_1$ and so $b_0 < a_0 \le b_0 + 1$; since
$a_0$ and $b_0$ are integers, we have $a_0 = b_0 + 1$ and the second alternative above
holds. If $n > 1$, then

$$a_0 = b_0 + \cfrac{1}{b_1 + \cfrac{1}{b_2 + \cdots}},$$

but $b_1 + 1/(b_2 + \cdots)$ is strictly greater than 1, so $b_0 < a_0 < b_0 + 1$, which is impossible.


In the case $m \ge 1$ we write

$$\alpha = [a_0, a_1, \ldots, a_m] = [b_0, b_1, \ldots, b_n];$$

reasoning as in the case $n = 1$ above, we have

$$a_0 < \alpha \le a_0 + 1 \quad \text{and} \quad b_0 < \alpha \le b_0 + 1.$$

Now if $\alpha$ is an integer then $a_0 + 1 = \alpha = b_0 + 1$, while if not then $a_0 = \lfloor\alpha\rfloor = b_0$;
but in either case, $a_0 = b_0$. Therefore

$$a_0 + \frac{1}{[a_1, \ldots, a_m]} = b_0 + \frac{1}{[b_1, \ldots, b_n]},$$

so $[a_1, \ldots, a_m] = [b_1, \ldots, b_n]$ and the result follows by induction.

4.2 CONTINUED FRACTIONS OF IRRATIONAL NUMBERS


Consider again our (presumed) continued fraction for $\sqrt2$. Using the tabular
method, or otherwise, we find that the first few convergents to $\sqrt2$ are

$$\frac{1}{1},\ \frac{3}{2},\ \frac{7}{5},\ \frac{17}{12},\ \frac{41}{29},\ \frac{99}{70},\ \frac{239}{169},\ \frac{577}{408},\ \ldots.$$

Evaluating these convergents (and, if necessary, a few more) as decimals, it is
not hard to convince ourselves that the convergents $p_{2k}/q_{2k}$ with even indices
form an increasing sequence converging to the limit $\sqrt2$, while the convergents
$p_{2k+1}/q_{2k+1}$ with odd indices form a decreasing sequence which converges to
the same limit. In fact, this observation is the key to proving that an infinite
simple continued fraction always converges; having done so, we shall find it
easy to confirm that the continued fraction for $\sqrt2$ is as we have conjectured.


Lemma 4.4. Oscillation of convergents. Let $p_k/q_k$ be the kth convergent to
the infinite simple continued fraction $\alpha = [a_0, a_1, a_2, \ldots]$. Then

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_4}{q_4} < \cdots < \frac{p_5}{q_5} < \frac{p_3}{q_3} < \frac{p_1}{q_1}. \qquad (4.1)$$

Moreover, $q_k$ increases without limit as $k \to \infty$.


Proof. It is obvious that $p_0/q_0 < p_1/q_1$. For any $k \ge 2$, we have

$$\frac{p_k}{q_k} = \frac{a_k p_{k-1} + p_{k-2}}{a_k q_{k-1} + q_{k-2}};$$

since all terms involved are positive, a result from elementary algebra (see
appendix 1) shows that $p_k/q_k$ lies between $p_{k-1}/q_{k-1}$ and $p_{k-2}/q_{k-2}$. Applying
this result repeatedly proves (4.1). To prove the second part of the lemma,
note first that $q_0 = 1$ and $q_1 = a_1 \ge 1$; then for $k \ge 2$, we have

$$q_k = a_k q_{k-1} + q_{k-2} \ge q_{k-1} + q_{k-2} \ge q_{k-1} + 1,$$

and so $q_k \to \infty$ as $k \to \infty$.

Comment. If $\alpha = [a_0, a_1, \ldots, a_n]$ is a finite continued fraction, then
equation (4.1) still holds provided that we omit all $p_k/q_k$ with $k > n$.
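The oscillation described in (4.1) is easy to observe on the presumed expansion $[1, 2, 2, 2, \ldots]$ of $\sqrt2$. The sketch below recomputes the convergents exactly and checks that the even-indexed ones increase, the odd-indexed ones decrease, and every even-indexed one lies below every odd-indexed one. Truncating after ten partial quotients is an arbitrary choice made here.

```python
from fractions import Fraction

def convergents(quotients: list[int]) -> list[Fraction]:
    """Exact convergents p_k/q_k for the partial quotients a_0, a_1, ..."""
    p_prev, p_curr, q_prev, q_curr = 0, 1, 1, 0
    result = []
    for a in quotients:
        p_prev, p_curr = p_curr, a * p_curr + p_prev
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        result.append(Fraction(p_curr, q_curr))
    return result

cs = convergents([1] + [2] * 9)
even, odd = cs[0::2], cs[1::2]
print(all(x < y for x, y in zip(even, even[1:])))   # even-indexed: increasing
print(all(x > y for x, y in zip(odd, odd[1:])))     # odd-indexed: decreasing
print(max(even) < min(odd))                         # the two families never cross
```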


Theorem 4.5. Any infinite simple continued fraction converges to a limit; 
moreover, this limit is irrational. 


Proof. Let $p_k/q_k$ be the kth convergent to $[a_0, a_1, a_2, \ldots]$. From the lemma,

$$\frac{p_0}{q_0},\ \frac{p_2}{q_2},\ \frac{p_4}{q_4},\ \ldots$$

is a monotonic increasing sequence which is bounded above (for example, by
$p_1/q_1$), and therefore tends to a limit $\alpha_L$ with $\alpha_L > p_{2k}/q_{2k}$ for all k. Similarly

$$\frac{p_1}{q_1},\ \frac{p_3}{q_3},\ \frac{p_5}{q_5},\ \ldots$$

decreases to a limit $\alpha_U$ with $\alpha_U < p_{2k+1}/q_{2k+1}$ for all k. Moreover $\alpha_U \ge \alpha_L$,
for otherwise we should be able to find convergents $p_{2k}/q_{2k}$ arbitrarily close
to $\alpha_L$ and $p_{2k+1}/q_{2k+1}$ arbitrarily close to $\alpha_U$; but then $p_{2k+1}/q_{2k+1}$ would be
smaller than $p_{2k}/q_{2k}$, contradicting (4.1). Therefore, for any k, we have

$$0 \le \alpha_U - \alpha_L < \frac{p_{2k+1}}{q_{2k+1}} - \frac{p_{2k}}{q_{2k}} = \frac{1}{q_{2k+1} q_{2k}}.$$

However, $\alpha_U - \alpha_L$ is independent of k, and from the lemma we know that
$1/(q_{2k+1} q_{2k}) \to 0$ as $k \to \infty$; so $\alpha_U = \alpha_L = \alpha$, say. Thus $[a_0, a_1, \ldots, a_k]$ tends
to the limit $\alpha$ as $k \to \infty$, and this means by definition that the continued
fraction represents the real number $\alpha$.


To prove the irrationality of $\alpha$ we use the test provided by Lemma 3.18. If
$\alpha$ is rational there is a positive constant c such that

$$\left|\alpha - \frac{p_{2k}}{q_{2k}}\right| \ge \frac{c}{q_{2k}}$$

for all convergents $p_{2k}/q_{2k}$, with at most one exception. But for every k, we
have

$$0 < \alpha - \frac{p_{2k}}{q_{2k}} < \frac{p_{2k+1}}{q_{2k+1}} - \frac{p_{2k}}{q_{2k}} = \frac{1}{q_{2k+1} q_{2k}}.$$

Combining these estimates yields $q_{2k+1} < 1/c$ for all (except possibly one) k,
which is not true as the denominators $q_{2k+1}$ are unbounded.


Comment. The latter part of this proof is not redundant. We showed in 
Theorem 4.2 above that a rational number has a finite continued fraction 
representation (in fact, two of them); but we did not prove that a rational has 
only finite representations. 


There is essentially only one fact we have yet to prove concerning the 
existence and uniqueness (or not) of the continued fraction for a given real 
number. 


Theorem 4.6. If $\alpha$ is a real irrational number, then it has a unique
expansion as a simple continued fraction (and we already know that this continued
fraction must be infinite).


Proof. Following the procedure we used for $\sqrt2$ on page 66, we set $\alpha_0 = \alpha$
and then for $k \ge 0$ define recursively

$$a_k = \lfloor\alpha_k\rfloor \quad \text{and} \quad \alpha_{k+1} = \frac{1}{\alpha_k - a_k}. \qquad (4.2)$$

Then $a_0$ is an integer, $a_1, a_2, \ldots$ are positive integers and $\alpha_1, \alpha_2, \ldots$ are real
numbers exceeding 1. Since $\alpha$ is irrational we can readily show that every
$\alpha_k$ is irrational; so $\alpha_k - a_k$ cannot vanish and the recursion never terminates.
Therefore, we have an infinite simple continued fraction, which by the previous
theorem converges to a limit, say

$$\beta = a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\,\frac{1}{a_3+}\cdots.$$

We have to show that $\alpha = \beta$. Now from (4.2) we prove by induction that

$$\alpha = [a_0, a_1, \ldots, a_{k-1}, \alpha_k] \qquad (4.3)$$

for all k, and hence by a property from Lemma 4.1,

$$\alpha = \frac{\alpha_k p_{k-1} + p_{k-2}}{\alpha_k q_{k-1} + q_{k-2}}.$$

Here $p_k/q_k = [a_0, a_1, \ldots, a_k]$ is the kth convergent to $\beta$. Note that we cannot
say immediately that $p_k/q_k$ is a convergent to $\alpha$ because the expression (4.3)
is not a simple continued fraction. Using appendix 1 again, $\alpha$ lies between

$$\frac{\alpha_k p_{k-1}}{\alpha_k q_{k-1}} \quad \text{and} \quad \frac{p_{k-2}}{q_{k-2}}.$$

But as $k \to \infty$ each of these fractions tends to the limit $\beta$, and so $\alpha = \beta$ as
required.

To prove uniqueness, suppose that we have two infinite simple continued
fractions representing the same number,

$$[a_0, a_1, a_2, a_3, \ldots] = [b_0, b_1, b_2, b_3, \ldots].$$

For any k, we can write these in the form

$$[a_0, a_1, \ldots, a_k, \alpha_{k+1}] = [b_0, b_1, \ldots, b_k, \beta_{k+1}]$$

with $\alpha_{k+1}, \beta_{k+1} > 1$, all $a_j, b_j$ integers and all except possibly $a_0, b_0$ positive.
Then from the following lemma (which will be useful again later), we have
immediately $a_k = b_k$. As this is true for all k, the two continued fractions are
identical.
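The recursion (4.2) used in the existence part of this proof is itself an algorithm. The sketch below applies it to a floating-point number. This is a deliberate simplification: a `float` is only an approximation to an irrational, and each reciprocal magnifies the rounding error, so only the first several partial quotients can be trusted. The function name `partial_quotients` is a choice made here.

```python
import math

def partial_quotients(alpha: float, count: int) -> list[int]:
    """First `count` partial quotients via a_k = floor(alpha_k), alpha_{k+1} = 1/(alpha_k - a_k)."""
    quotients = []
    for _ in range(count):
        a = math.floor(alpha)
        quotients.append(a)
        fractional = alpha - a
        if fractional == 0:       # rational input, or precision exhausted
            break
        alpha = 1.0 / fractional  # the next complete quotient
    return quotients

print(partial_quotients(math.sqrt(2), 6))
print(partial_quotients(math.pi, 5))
```

For exact work with quadratic irrationals, integer arithmetic on the complete quotients avoids the precision problem entirely.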


Lemma 4.7. A uniqueness property. Suppose that $0 \le m \le n$ and

$$[a_0, a_1, \ldots, a_{m-1}, \alpha_m] = [b_0, b_1, \ldots, b_{n-1}, \beta_n],$$

where $a_0, b_0$ are integers; $a_1, \ldots, a_{m-1}, b_1, \ldots, b_{n-1}$ are positive integers; and
$\alpha_m, \beta_n$ are real numbers strictly greater than 1. Then

$$a_0 = b_0,\ a_1 = b_1,\ \ldots,\ a_{m-1} = b_{m-1} \quad \text{and} \quad \alpha_m = [b_m, \ldots, b_{n-1}, \beta_n].$$


Comment. This result can be expressed as follows: if two continued
fractions, each ending with complete quotients greater than 1, represent the same
number, then “as far as they are both simple” they are identical.

Proof. Let $m \ge 1$ and suppose that

$$\alpha = [a_0, a_1, \ldots, a_{m-1}, \alpha_m] = [b_0, b_1, \ldots, b_{n-1}, \beta_n]$$

with the conditions of the theorem satisfied. Observe that

$$[a_1, \ldots, a_{m-1}, \alpha_m] > 1;$$

for if $m = 1$ (so that there are no terms $a_k$) then the left-hand side is just
$\alpha_m$, which is greater than 1 by assumption, while if $m > 1$ then the left-hand
side is the sum of a positive integer $a_1$ and a strictly positive quantity. Hence

$$0 < \frac{1}{a_1+}\cdots\frac{1}{a_{m-1}+}\,\frac{1}{\alpha_m} < 1$$

and we have $a_0 < \alpha < a_0 + 1$. Similar reasoning holds for the other continued
fraction, and so

$$a_0 = \lfloor\alpha\rfloor = b_0.$$

Consequently

$$[a_1, \ldots, a_{m-1}, \alpha_m] = [b_1, \ldots, b_{n-1}, \beta_n],$$

and the result will follow from the corresponding result for $m - 1$ terms. Since
the case $m = 0$ is vacuously true (almost), the result is proved by mathematical
induction.


Examples. Although we now know that $\sqrt2$ has a continued fraction, and
that $[1, 2, 2, 2, \ldots]$ converges, we still have not shown that they are equal! To
do this, write

$$\alpha = 1 + \frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{2+}\cdots \quad \text{and} \quad \beta = 2 + \frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{2+}\cdots.$$

Then $\beta = 2 + 1/\beta$, which simplifies to $\beta^2 - 2\beta - 1 = 0$. Since $\beta$ is positive we
have $\beta = 1 + \sqrt2$; but $\alpha = 1 + 1/\beta$ and at last we have shown properly that

$$\sqrt2 = 1 + \frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{2+}\cdots.$$

As a second example we evaluate $\alpha = [5, 4, 3, 2, 1, 3, 2, 1, \ldots]$, a periodic
continued fraction. Set $\beta = [3, 2, 1, 3, 2, 1, \ldots]$; then

$$\beta = 3 + \frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{\beta}.$$

Evaluating the latter fraction by the tabular method

    a_k |            3    2    1    β
    p_k |  0    1    3    7   10   10β + 7
    q_k |  1    0    1    2    3    3β + 2

gives

$$\beta = \frac{10\beta + 7}{3\beta + 2},$$

which leads to the quadratic equation $3\beta^2 - 8\beta - 7 = 0$ with positive root
$\beta = \frac{1}{3}(4 + \sqrt{37})$; using similar means to evaluate $\alpha$ in terms of $\beta$, we find that

$$\alpha = 5 + \frac{1}{4+}\,\frac{1}{3+}\,\frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{3+}\,\frac{1}{2+}\,\frac{1}{1+}\cdots = \frac{21\beta + 5}{4\beta + 1} = \frac{409 - \sqrt{37}}{77}.$$

In fact the same procedure will show that any (eventually) periodic continued
fraction is a quadratic irrational. Consider

$$\alpha = [a_0, a_1, \ldots, a_{m-1}, b_0, b_1, \ldots, b_{n-1}, b_0, b_1, \ldots, b_{n-1}, b_0, \ldots],$$

and let

$$\beta = [b_0, b_1, \ldots, b_{n-1}, b_0, b_1, \ldots, b_{n-1}, b_0, \ldots].$$

If we write $p_k/q_k$ for the convergents to $\alpha$ and $r_k/s_k$ for the convergents to $\beta$,
we have

$$\alpha = [a_0, a_1, \ldots, a_{m-1}, \beta] = \frac{\beta p_{m-1} + p_{m-2}}{\beta q_{m-1} + q_{m-2}}$$

and

$$\beta = [b_0, b_1, \ldots, b_{n-1}, \beta] = \frac{\beta r_{n-1} + r_{n-2}}{\beta s_{n-1} + s_{n-2}}.$$

The second equation reduces to a quadratic with integral coefficients, so $\beta$ is
a quadratic irrational (the quadratic cannot have rational roots since $\beta$ has
an infinite continued fraction). Substituting the value of $\beta$ into the previous
equation and, if desired, rationalising the denominator shows that $\alpha$ is also a
quadratic irrational. We have proved the following result.


Theorem 4.8. An eventually periodic simple continued fraction represents a 
quadratic irrational. 
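The closed form found for the second example can be checked numerically by truncating the periodic continued fraction after many periods and evaluating it from the inside out. This is a sketch only: the function name `evaluate` and the choice of 30 repetitions are assumptions made here, and a numerical agreement is a sanity check rather than a proof.

```python
import math
from fractions import Fraction

def evaluate(prefix: list[int], period: list[int], repeats: int = 30) -> Fraction:
    """Value of [prefix, period, period, ...] truncated after `repeats` periods."""
    terms = prefix + period * repeats
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value      # peel the fraction back one level at a time
    return value

alpha = evaluate([5, 4], [3, 2, 1])
beta = evaluate([], [3, 2, 1])
print(float(alpha), (409 - math.sqrt(37)) / 77)
print(float(beta), (4 + math.sqrt(37)) / 3)
```

Both pairs of printed values agree to the precision of double-precision arithmetic.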


In fact, the converse of this result is true, so that a quadratic irrational 
is characterised by having an (infinite) eventually periodic continued fraction. 
We shall prove this immediately in order to complete our study of quadratic 
irrationals, even though the proof requires a result concerning approximation 
properties of convergents that we shall not prove until later. We leave it to 
the reader to confirm that our proof of Lemma 4.10 does not depend on any 
result which has not been proved up to this point. 


Theorem 4.9. The continued fraction of a quadratic irrational number is 
eventually periodic. 




Proof. Let $\alpha$ be a root of the (irreducible) quadratic $Az^2 + Bz + C$, where
A, B and C are integers with A and C not zero. Since the convergents and
the complete quotients of $\alpha$ satisfy the relation

$$\alpha = \frac{\alpha_k p_{k-1} + p_{k-2}}{\alpha_k q_{k-1} + q_{k-2}}$$

we have

$$A\left(\frac{\alpha_k p_{k-1} + p_{k-2}}{\alpha_k q_{k-1} + q_{k-2}}\right)^2 + B\left(\frac{\alpha_k p_{k-1} + p_{k-2}}{\alpha_k q_{k-1} + q_{k-2}}\right) + C = 0;$$

once we emerge from algebraic manipulations we find that $\alpha_k$ is a root of a
quadratic $A_k z^2 + B_k z + C_k$, where $A_k$, $B_k$ and $C_k$ are integers given by

$$\begin{aligned}
A_k &= A p_{k-1}^2 + B p_{k-1} q_{k-1} + C q_{k-1}^2 \\
B_k &= 2A p_{k-1} p_{k-2} + B p_{k-1} q_{k-2} + B p_{k-2} q_{k-1} + 2C q_{k-1} q_{k-2} \\
C_k &= A p_{k-2}^2 + B p_{k-2} q_{k-2} + C q_{k-2}^2.
\end{aligned}$$

Now write $d_k = q_k\alpha - p_k$; by the inequality in (4.5), we have

$$|d_k| = q_k\left|\alpha - \frac{p_k}{q_k}\right| < \frac{1}{q_{k+1}} \le \frac{1}{q_k}. \qquad (4.4)$$

Therefore

$$\begin{aligned}
A_k &= A(q_{k-1}\alpha - d_{k-1})^2 + B(q_{k-1}\alpha - d_{k-1}) q_{k-1} + C q_{k-1}^2 \\
&= (A\alpha^2 + B\alpha + C) q_{k-1}^2 - (2A\alpha + B) d_{k-1} q_{k-1} + A d_{k-1}^2 \\
&= A d_{k-1}^2 - (2A\alpha + B) d_{k-1} q_{k-1},
\end{aligned}$$

and using the inequality (4.4) we obtain

$$|A_k| < |A| + |2A\alpha + B|.$$

Performing very similar calculations for $B_k$, and observing that

$$|q_{k-1} d_{k-2} + q_{k-2} d_{k-1}| < \frac{q_{k-1}}{q_{k-1}} + \frac{q_{k-2}}{q_k} < 2,$$

we find that

$$|B_k| < 2\bigl(|A| + |2A\alpha + B|\bigr);$$

finally,

$$|C_k| = |A_{k-1}| < |A| + |2A\alpha + B|.$$

Hence the coefficients $A_k$, $B_k$ and $C_k$ are bounded independently of k; so they
take only finitely many values, and the quadratics $A_k z^2 + B_k z + C_k$ have
altogether only finitely many roots. Therefore, we must have $\alpha_k = \alpha_{k+\ell}$ for
some k and some $\ell \ge 1$; this implies that

$$a_k = \lfloor\alpha_k\rfloor = \lfloor\alpha_{k+\ell}\rfloor = a_{k+\ell},$$

and that the same relations must hold for all subsequent k. Hence the
continued fraction of $\alpha$ is eventually periodic.

Comments.


• As often happens with this kind of argument, the number of possibilities
for $\alpha_k$ given by our estimate is far in excess of the actual number. For
example, if $\alpha = \sqrt2$ then we have $A = 1$, $B = 0$ and $C = -2$, which
leads to

$$0 < |A_k| \le 3, \quad |B_k| \le 7 \quad \text{and} \quad 0 < |C_k| \le 3.$$

Even if we assume (as we may) that $A_k$ is positive, this gives 270 possible
quadratics and 540 possible $\alpha_k$. But in fact $\alpha_k$ takes only two different
values!


• An alternative argument for part of the above proof: by straightforward
algebra, $B_k^2 - 4A_kC_k = B^2 - 4AC$, which is a fixed constant. So once we
have proved that $A_k$ and $C_k$ take only finitely many values, the same
follows immediately for $B_k$.


• Why does this not work for irrationalities of higher degree? Well, if $\alpha$
is (say) a cubic irrational, we should expect to get a formula something
like

$$\begin{aligned}
A_k = {} & (A\alpha^3 + B\alpha^2 + C\alpha + D) q_{k-1}^3 \\
& - (3A\alpha^2 + 2B\alpha + C) q_{k-1}^2 d_{k-1} \\
& + (3A\alpha + B) q_{k-1} d_{k-1}^2 - A d_{k-1}^3.
\end{aligned}$$

Now as in the quadratic case, the first term here will vanish and the
last two will be bounded. However, the second term will be, roughly, a
constant times $q_{k-1}$, which is unbounded, and this will wreck the entire
argument.


• In view of the theorem just proved and earlier results, we can sum up
the connection between the continued fraction of a real number and its
algebraic status in this way: algebraic numbers of degree 1 have finite
continued fractions; those of degree 2 have ultimately periodic infinite
continued fractions; and those of higher degree, as well as transcendental
numbers, have non-periodic infinite continued fractions.
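The periodicity argument can be watched in action for $\alpha = \sqrt{d}$. The sketch below is not the method of the proof: it uses a standard integer recurrence, not given in the text, that writes each complete quotient as $(m + \sqrt{d})/\mathit{den}$ and updates the integers $m$ and $\mathit{den}$ exactly. In the spirit of the proof, the expansion is declared periodic as soon as a complete quotient, identified by its pair $(m, \mathit{den})$, occurs for the second time. The function name and return format are choices made here.

```python
from math import isqrt

def sqrt_continued_fraction(d: int) -> tuple[list[int], list[int]]:
    """Return (pre-period, period) of the continued fraction of sqrt(d), d not a square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0            # complete quotient is (m + sqrt(d)) / den
    seen: dict[tuple[int, int], int] = {}
    quotients: list[int] = []
    while (m, den) not in seen:     # stop at the first repeated complete quotient
        seen[(m, den)] = len(quotients)
        quotients.append(a)
        m = den * a - m
        den = (d - m * m) // den    # this division is always exact
        a = (a0 + m) // den
    start = seen[(m, den)]
    return quotients[:start], quotients[start:]

print(sqrt_continued_fraction(2))   # ([1], [2])
print(sqrt_continued_fraction(7))   # ([2], [1, 1, 1, 4])
```

For $d = 2$ the loop meets only two distinct complete quotients, in line with the remark that $\alpha_k$ takes only two different values.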


4.3 APPROXIMATION PROPERTIES OF CONVERGENTS 


Having proved all that we need about representation of numbers by contin- 
ued fractions, we proceed to investigate what continued fractions can tell us 
about the approximation of irrationals by rationals. This will link the present 
topic with that of the previous chapter. First, some equalities and inequalities 
concerning the difference between a number and its convergents. 


Lemma 4.10. Let $\alpha = [a_0, a_1, a_2, \ldots]$ be an infinite simple continued fraction
with convergents $p_k/q_k$ and complete quotients $\alpha_k$. Then for $k \ge 0$ we have

$$\left|\alpha - \frac{p_k}{q_k}\right| = \frac{1}{(\alpha_{k+1} q_k + q_{k-1}) q_k} < \frac{1}{q_{k+1} q_k} \le \frac{1}{a_{k+1} q_k^2}. \qquad (4.5)$$

If $\alpha = [a_0, a_1, \ldots, a_n]$ is a finite continued fraction, then the same relations
hold for $0 \le k < n - 1$, while if $k = n - 1$ the first inequality must be replaced
by equality.

Proof. We have

$$\alpha - \frac{p_k}{q_k} = \frac{\alpha_{k+1} p_k + p_{k-1}}{\alpha_{k+1} q_k + q_{k-1}} - \frac{p_k}{q_k} = \frac{p_{k-1} q_k - p_k q_{k-1}}{(\alpha_{k+1} q_k + q_{k-1}) q_k},$$

and the equality follows since $p_{k-1} q_k - p_k q_{k-1} = \pm 1$. To prove the inequalities
we need only observe that, except in the finite case when $k + 1 = n$, we have

$$\alpha_{k+1} q_k + q_{k-1} > a_{k+1} q_k + q_{k-1} = q_{k+1} \ge a_{k+1} q_k.$$

Comments.


• Using this we can give an alternative proof of Theorem 3.21, that any
irrational is approximable to order 2. For an irrational number $\alpha$ has
infinitely many convergents p/q, and, from (4.5), every one of these
satisfies

$$0 < \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}.$$


• Observe also from (4.5) that $p_k/q_k$ will approximate $\alpha$ to within $c/q_k^2$,
and that the constant c will be exceptionally small if the next partial
quotient $a_{k+1}$ is large. Now consider the continued fraction for $\pi$. By
taking a sufficiently accurate decimal approximation to $\pi$, or possibly
by other methods, we obtain

$$\pi = 3 + \frac{1}{7+}\,\frac{1}{15+}\,\frac{1}{1+}\,\frac{1}{292+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{2+}\cdots.$$

So we can find two very good fractional approximations to $\pi$ by
truncating the continued fraction just before the partial quotients 15 and 292.
The first of these gives $\pi \approx [3, 7] = \frac{22}{7}$. This approximation is, of course,
well known; perhaps less well known is just how good an approximation
it is. The discrepancy is

$$\left|\pi - \frac{22}{7}\right| < \frac{1}{15 \times 7^2} = \frac{1}{735};$$

to obtain similar accuracy from the decimal expansion we would need
to take two decimal places, and this would mean, in effect, to
approximate $\pi$ by $\frac{314}{100}$, a much more complicated fraction than $\frac{22}{7}$. The other
approximation cited above gives

$$\pi \approx 3 + \frac{1}{7+}\,\frac{1}{15+}\,\frac{1}{1} = \frac{355}{113} \quad \text{with} \quad \left|\pi - \frac{355}{113}\right| < \frac{1}{292 \times 113^2}.$$

This estimate was known to the Chinese mathematician Zu Chongzhi
(A.D. 429-501); it is accurate to six decimal places, and is a closer
approximation than the fraction $\frac{3141593}{1000000}$.
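The numerical claims about 22/7 and 355/113 can be checked directly with double-precision arithmetic, which is ample at this scale. The helper name `discrepancy` is a choice made here.

```python
import math

def discrepancy(p: int, q: int) -> float:
    """Absolute difference between pi and the fraction p/q."""
    return abs(math.pi - p / q)

# Each error is below 1 / (a_{k+1} * q_k^2), with next partial quotient 15 or 292.
print(discrepancy(22, 7), 1 / (15 * 7**2))
print(discrepancy(355, 113), 1 / (292 * 113**2))
# 355/113 is closer to pi than the seven-digit decimal 3.141593.
print(discrepancy(355, 113) < discrepancy(3141593, 1000000))
```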
We shall use the inequalities of the preceding lemma to look more closely at
the way in which numbers are approximated by their convergents, and thereby
to see how the continued fraction of a real number $\alpha$ is related to its order of
approximation by rationals. Since we already know (pages 46-47) everything
there is to know about approximability properties of rationals, we shall
generally assume that $\alpha$ is irrational, and shall restrict ourselves to occasional
comments on the rational case. Many of the properties we shall prove do in
fact also apply to this case, with minor modifications sometimes being
necessary. First, we show that the discrepancy between $\alpha$ and a convergent $p_k/q_k$
always decreases as k increases.


Lemma 4.11. Let $\alpha = [a_0, a_1, a_2, \ldots]$ be an infinite simple continued fraction
with convergents $p_k/q_k$ and complete quotients $\alpha_k$. Then for $k \ge 1$, we have

$$|q_k\alpha - p_k| < |q_{k-1}\alpha - p_{k-1}| \quad \text{and} \quad \left|\alpha - \frac{p_k}{q_k}\right| < \left|\alpha - \frac{p_{k-1}}{q_{k-1}}\right|.$$


Proof. We use the equality in (4.5). The first inequality to be proved is
equivalent to

$$\alpha_{k+1} q_k + q_{k-1} > \alpha_k q_{k-1} + q_{k-2};$$

substituting $q_{k-2} = q_k - a_k q_{k-1}$ and rearranging shows that we need to prove

$$(\alpha_{k+1} - 1) q_k > (\alpha_k - a_k - 1) q_{k-1}.$$

But for an infinite continued fraction it is always true that $\alpha_{k+1} > 1$, and
hence that $\alpha_k - a_k = 1/\alpha_{k+1} < 1$; therefore

$$(\alpha_{k+1} - 1) q_k > 0 > (\alpha_k - a_k - 1) q_{k-1},$$

and the first part of the proof is complete. The second inequality follows
immediately since $q_k \ge q_{k-1}$.

Comment. If $\alpha = [a_0, a_1, \ldots, a_n] \in \mathbb{Q}$ and if $a_n > 1$, then the inequalities
just proved hold for $1 \le k \le n$. If $a_n = 1$ and $k = n - 1$ the first inequality is
false, and must be replaced by an equality. However, we may ignore this case,
since, as pointed out on page 69, a rational number can (nearly) always be
written as a continued fraction whose last partial quotient is strictly greater
than 1.


The following result shows that the “best” rational approximations to an 
irrational number are its convergents, as foreshadowed at the beginning of this 
chapter. 


Theorem 4.12. Convergents are best approximations. Let $\alpha$ be a real
irrational with convergents $p_k/q_k$.


• If p/q is a rational number (with, as usual, positive denominator) such
that

$$\left|\alpha - \frac{p}{q}\right| < \left|\alpha - \frac{p_k}{q_k}\right| \qquad (4.6)$$

with $k \ge 1$, then $q > q_k$.

• If p and q are integers, $q > 0$, such that

$$|q\alpha - p| < |q_k\alpha - p_k| \qquad (4.7)$$

with $k \ge 0$, then $q \ge q_{k+1}$.


Proof. We shall prove the second part of the theorem first, and then show
that the first part easily follows. So, suppose that $\alpha$ is irrational, that $k \ge 0$
and that $|q\alpha - p| < |q_k\alpha - p_k|$; consider the system of linear equations

$$\begin{aligned} p_k x + p_{k+1} y &= p \\ q_k x + q_{k+1} y &= q . \end{aligned}$$

The determinant of the coefficients on the left-hand side is $p_k q_{k+1} - p_{k+1} q_k$,
which is $\pm 1$, and so the system has a solution in integers $x, y$ (see the second
appendix to this chapter). Recall that $q$, $q_k$ and $q_{k+1}$ are all positive; from
the second equation above it is therefore clear that $x$ and $y$ cannot both be
negative, while if both are positive then $q > q_{k+1} y \ge q_{k+1}$ and the result is
proved. So we may assume that one of $x, y$ is positive (or zero), and the other
is negative (or zero). It follows from the proof of Theorem 4.5 that one of

$$\alpha - \frac{p_k}{q_k} , \qquad \alpha - \frac{p_{k+1}}{q_{k+1}}$$

is positive, one negative; and so the same is true for the expressions $q_k\alpha - p_k$
and $q_{k+1}\alpha - p_{k+1}$. Therefore

$$x(q_k\alpha - p_k) \quad\text{and}\quad y(q_{k+1}\alpha - p_{k+1})$$

have the same sign, and hence

$$|x|\,|q_k\alpha - p_k| + |y|\,|q_{k+1}\alpha - p_{k+1}| = |x(q_k\alpha - p_k) + y(q_{k+1}\alpha - p_{k+1})| .$$

But the right-hand side is just $|q\alpha - p|$, and so

$$|x|\,|q_k\alpha - p_k| \le |q\alpha - p| < |q_k\alpha - p_k| .$$

This can be true only if $x = 0$, so $q = q_{k+1} y \ge q_{k+1}$ and the second result
stated above is proved.


To prove the first result, suppose that the inequality (4.6) holds for some
$k \ge 1$ but that $q \le q_k$. Multiplying these inequalities (permissible as all the
quantities involved are positive) we obtain $|q\alpha - p| < |q_k\alpha - p_k|$. From the
result already proved, we have $q \ge q_{k+1}$, and so $q_{k+1} \le q_k$, which is not
possible for $k \ge 1$.


Comments. 


• The first result shows that if $p/q$ is a better approximation to $\alpha$ than
$p_k/q_k$, then the former fraction must have a larger denominator than
the latter; in other words, of all fractions with denominators up to $q_k$,
the convergent $p_k/q_k$ is the closest to $\alpha$. In fact, if we wish to obtain a
better approximation than $p_k/q_k$ in the stronger sense (4.7), we have to
go at least as far as the next convergent.


• The condition $k \ge 1$ in the first result is necessary because $q_0 = q_1$ for
some continued fractions $\alpha$. In fact, if $n + \frac12 < \alpha < n + 1$ then we have

$$\alpha = n + \frac{1}{1+}\,\frac{1}{\alpha_2} \quad\text{and}\quad \frac{p_0}{q_0} = \frac{n}{1} .$$

However, if $p/q = (n+1)/1$ then the inequality (4.6) is satisfied but the
conclusion $q > q_0$ is false.


• An interesting alternative approach to continued fractions is to define
the convergents to a real number $\alpha$ as the best rational approximations
to $\alpha$, and then derive all of the important properties of convergents.
Specifically, we let $p_0 = \lfloor\alpha\rfloor$ and $q_0 = 1$, and then define, for each
$k \ge 0$, the convergent $p_{k+1}/q_{k+1}$ to be the fraction with smallest possible
denominator such that

$$|q_{k+1}\alpha - p_{k+1}| < |q_k\alpha - p_k| .$$

We can then show that $q_{k-1}$ is a factor of $q_k - q_{k-2}$, and define the partial
quotients of $\alpha$ by $a_k = (q_k - q_{k-2})/q_{k-1}$. For more on this approach see
Cassels [17].
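
The second part of Theorem 4.12 can be checked by brute force for small denominators. The following is a minimal sketch, not part of the original text: it assumes a 35-digit rational stand-in for $\pi$ (the name `ALPHA` and both helper functions are illustrative), which is accurate enough that its first few convergents agree with those of $\pi$.

```python
from fractions import Fraction
from math import floor

def convergents(x, count):
    """First `count` convergents (p_k, q_k) of x, for k = 0, 1, 2, ..."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    a = floor(x)
    p, q = a, 1                    # p_0, q_0
    result = [(p, q)]
    rest = x - a
    while rest != 0 and len(result) < count:
        x = 1 / rest               # next complete quotient
        a = floor(x)
        rest = x - a
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

def is_best_up_to_next(x, convs, k):
    """Check that no q < q_{k+1} gives |q x - p| < |q_k x - p_k|."""
    p_k, q_k = convs[k]
    bound = abs(q_k * x - p_k)
    q_next = convs[k + 1][1]
    # For each q the best numerator is the integer nearest to q * x.
    return all(abs(q * x - round(q * x)) >= bound for q in range(1, q_next))

# Assumption: a rational stand-in for pi, exact arithmetic throughout.
ALPHA = Fraction("3.14159265358979323846264338327950288")
convs = convergents(ALPHA, 5)
print(convs)
print(all(is_best_up_to_next(ALPHA, convs, k) for k in range(4)))
```

The search confirms, for $k = 0, 1, 2, 3$, that no denominator smaller than $q_{k+1}$ beats the convergent $p_k/q_k$ in the sense of (4.7).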


The next result follows easily from that above. 


Theorem 4.13. “Best approximations are convergents”. Let $\alpha$ be irrational.
If

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{2q^2}$$

then $p/q$ is a convergent to $\alpha$.


Comment. Conversely, it can be shown that of any two consecutive conver- 
gents to a, at least one satisfies the above inequality. We leave this as an 
exercise. 

Proof. Suppose that the given inequality holds. Since $q_k \to \infty$ as $k \to \infty$,
there exists $k \ge 0$ such that $q_k \le q < q_{k+1}$. From the previous theorem we
have

$$|q_k\alpha - p_k| \le |q\alpha - p|$$

because $q < q_{k+1}$, and hence

$$|pq_k - p_kq| \le q\,|q_k\alpha - p_k| + q_k\,|q\alpha - p| \le (q + q_k)\,|q\alpha - p| < \frac{q + q_k}{2q} \le 1 .$$

Since the left-hand side is an integer it must be zero, and $p/q = p_k/q_k$, which
is a convergent to $\alpha$.


4.4 TWO IMPORTANT APPROXIMATION PROBLEMS


The above theorem shows, roughly speaking, that convergents are not merely 
good approximations but in a sense the only possible good approximations 
to a real number. We digress to look at two important problems of rational 
approximation which may be solved using continued fractions. 


4.4.1 How many days should we count in a calendar year? 


This problem was addressed by Euler in [25]. The difficulty is that for conve- 
nience of use, the calendar really should contain an integral number of days 
per year, whereas the actual length of a solar year (that is, the period from 
one northern spring equinox to the next) can be measured as 365 days, 5 
hours, 48 minutes and 46 seconds. If a year were to contain an exact number 
of days, and always the same number, the calendar would not keep pace with 
the seasons; students who like to finish their exams around the beginning of 
December and then head for the beach would at some (not very distant!) date 
find the weather in December rather unsuitable for this. The main impetus 
for the creation of the modern calendar came in the mediaeval period, when
it was realised that accumulated errors in the calendar would eventually lead 
to Easter, traditionally a spring festival in the northern hemisphere, being 
celebrated in midwinter. 


The solution to the calendar problem is, in outline, simple and well known: 
we adopt 365 days as the standard length of a year, and decree that certain 
leap years shall be allocated one extra day. The difficulty lies in the details. 
How shall we determine precisely which years are to be leap years? 


Suppose that in a cycle of $q$ years we add an extra day in each of $p$ years.
Then the average length of a calendar year will be $365 + p/q$ days, and we
would like this to equal the observed length of a solar year. Converting the
above data into a rational number of days, and deducting 365, we want

$$\frac{p}{q} = \frac{10463}{43200} .$$

While we could obtain an exact fit to the observations by choosing 10463 
years in every 432 centuries as leap years, and then repeating the pattern, it 
is clear that such a scheme would be too cumbersome for practical use. What 
we need, therefore, is a good approximation to p/q having a much smaller 
denominator; and as we have seen, this is a question which can be answered 
by examining the convergents of $p/q$. We may compute the continued fraction


$$\frac{10463}{43200} = \frac{1}{4+}\,\frac{1}{7+}\,\frac{1}{1+}\,\frac{1}{3+}\,\frac{1}{5+}\,\frac{1}{64} ,$$

from which we find the convergents

$$\frac{1}{4} ,\quad \frac{7}{29} ,\quad \frac{8}{33} ,\quad \frac{31}{128} ,\quad \frac{163}{673} ,\quad \frac{10463}{43200} .$$

[Footnote 1: Note for northern hemisphere readers: this was written in Australia,
where the academic teaching year traditionally runs from March to November.
The summer months are December, January and February.]
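
The expansion and its convergents can be reproduced with exact rational arithmetic. The sketch below is illustrative only (the function names are not from the text); it converts 5 hours, 48 minutes and 46 seconds into a fraction of a day and applies the Euclidean algorithm.

```python
from fractions import Fraction

def continued_fraction(x):
    """Partial quotients of a rational number, by the Euclidean algorithm."""
    n, d = x.numerator, x.denominator
    quotients = []
    while d:
        a, r = divmod(n, d)
        quotients.append(a)
        n, d = d, r
    return quotients

def convergents(quotients):
    """Convergents p_k/q_k built from the usual two-term recurrence."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

# Excess of the solar year over 365 days, as a fraction of a day.
excess = Fraction(5 * 3600 + 48 * 60 + 46, 24 * 3600)
quotients = continued_fraction(excess)
fractions_found = convergents(quotients)

print(excess)        # 10463/43200
print(quotients)     # [0, 4, 7, 1, 3, 5, 64]
print([str(c) for c in fractions_found[1:]])
```

The leading quotient 0 simply records that the excess is less than one day; the remaining convergents are the six fractions listed above.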


The first of these convergents suggests that we make every fourth year a leap 
year. This very simple rule constitutes the Julian calendar, so called because it 
was instituted by Julius Caesar, on the advice of the Alexandrian astronomer 
Sosigenes, in about 45 BC. According to Sosigenes’ scheme, four calendar years
would be about 45 minutes longer than four solar years, and so, for example, 
midsummer’s day would occur slightly earlier (according to the calendar) every 
four years. The dates of midsummer and midwinter would be interchanged 
after about 23000 years; while this prospect would not appear close enough 
to worry anyone, smaller but still significant discrepancies became noticeable 
within a few centuries, and led to further reforms of the calendar. 


We shall ignore the second convergent $\frac{7}{29}$; it is not much simpler than
the following one, $\frac{8}{33}$, and it is not an exceptionally good approximation
anyway. We recall that this is because the third partial quotient here is not very
large. The third convergent can be used to improve on the Julian calendar.
It prompts us to declare 8 leap years in 33 years, or 24 in 99 years. Let us 
adjust this to 24 leap years in 100 years. The approximation will then be less 
good, but it will be very easy to implement. We need only decree that every 
fourth year shall be a leap year, with the exception of one year per century, 
say the century year itself, which shall be an ordinary 365-day year. Using 
this calendar it would take over 80000 years for the dates of midsummer and 
midwinter to be interchanged. 


The Julian calendar was in use throughout Europe for 1600 years or so, 
by which time the difference between the calendar year and the solar year 
was causing perceptible problems. The difficulty was not specifically with the 
alteration of the seasons but with calculating the date of Easter. This date 
depends on the first occurrence of the full moon after the (northern) spring 
equinox. Since the Julian calendar year is longer than the true solar year, the 
assumed date, March 21, for the equinox will bit by bit arrive later than the 
true date. Even an error of a day or two — and by the 1500s the accumulated 
error was about ten days — could cause the first full moon after the equinox to 
be missed, and a “wrong” full moon to be chosen instead. As a consequence 
Easter could be celebrated a month or more later than it should have been, 
and in the late sixteenth century Pope Gregory XIII instituted a commission 
to advise on how to correct the accumulated errors of 1600 years and minimise 
future errors. The scheme proposed by this commission, adopted in 1582 in 
most European Catholic countries and nowadays standard throughout the 
world, is known as the Gregorian calendar. 


Consider the fourth convergent $\frac{31}{128}$ to our required ratio. In itself this
would suggest a 128-year cycle of ordinary and leap years, which probably
would not be convenient to use. But we might notice that

$$\frac{31}{128} = \frac{96\frac{7}{8}}{400} \approx \frac{97}{400} .$$


To employ a scheme based on this approximation we begin by decreeing a 
leap year every four years, but then omit three of these leap years — that is, 
restore them to 365 days — every 400 years. The precise method in use today 
is that every year divisible by 4 is a leap year, except that a year divisible by 
100 is not a leap year, except that a year divisible by 400 once again is a leap 
year. The average length of a year under the Gregorian calendar is 365 days, 5 
hours, 49 minutes and 12 seconds, just 26 seconds longer than the solar year. 
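
The rule just stated, and the resulting mean year length, can be checked directly. This is a small sketch under the figures quoted in the text (a solar year of 365 days, 5 hours, 48 minutes and 46 seconds); the function name `is_leap` is illustrative.

```python
from fractions import Fraction

def is_leap(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Leap years in one full 400-year cycle.
leap_count = sum(is_leap(y) for y in range(1, 401))

# Mean excess over 365 days, in seconds (an exact integer here).
excess_seconds = int(Fraction(leap_count, 400) * 86400)
hours, rest = divmod(excess_seconds, 3600)
minutes, seconds = divmod(rest, 60)

solar_excess_seconds = 5 * 3600 + 48 * 60 + 46
print(leap_count)                                  # 97
print(hours, minutes, seconds)                     # 5 49 12
print(excess_seconds - solar_excess_seconds)       # 26
```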


There is another way to derive this approximation from the convergents
to $p/q$. If $p_{k-1}/q_{k-1}$ and $p_k/q_k$ are adjacent convergents to any number $\alpha$,
consider the fractions

$$\frac{p_{k-1}}{q_{k-1}} ,\quad \frac{p_{k-1} + p_k}{q_{k-1} + q_k} ,\quad \frac{p_{k-1} + 2p_k}{q_{k-1} + 2q_k} ,\quad \frac{p_{k-1} + 3p_k}{q_{k-1} + 3q_k} ,\quad \ldots ,\quad \frac{p_{k-1} + a_{k+1}p_k}{q_{k-1} + a_{k+1}q_k} .$$

The last of these is the convergent $p_{k+1}/q_{k+1}$; the first is obviously a
convergent; the others are known as secondary convergents. Roughly speaking, the
secondary convergents to $\alpha$ are, after the convergents, the next best rational
approximations to $\alpha$. Here one of the secondary convergents to $\frac{10463}{43200}$ is

$$\frac{31 + 163}{128 + 673} = \frac{194}{801} \approx \frac{194}{800} = \frac{97}{400} ,$$

from which we obtain again the Gregorian approximation.
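
The chain of fractions between two adjacent convergents is easy to generate. The sketch below (the function name is illustrative) lists the fractions between $31/128$ and $163/673$, using the next partial quotient 64 found earlier.

```python
from fractions import Fraction

def intermediate_fractions(prev, cur, a_next):
    """(p_{k-1} + j p_k, q_{k-1} + j q_k) for j = 0, 1, ..., a_next."""
    (p0, q0), (p1, q1) = prev, cur
    return [(p0 + j * p1, q0 + j * q1) for j in range(a_next + 1)]

chain = intermediate_fractions((31, 128), (163, 673), 64)

print(chain[0])     # (31, 128): the earlier convergent
print(chain[1])     # (194, 801): the secondary convergent used in the text
print(chain[-1])    # (10463, 43200): the next convergent

# How far is 194/801 from the rounded value 97/400?
gap = abs(Fraction(194, 801) - Fraction(97, 400))
print(gap)          # 97/320400
```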


Many documents from Pope Gregory’s scholars still exist, and it is clear 
that they did not in fact use continued fractions to derive their calendar. 
According to N.M. Beskin [13], the commission took the length of the solar 
year to be 365 days, 5 hours, 49 minutes and 16 seconds; this value was given 
in the Alfonsine tables, a compendium of astronomical data put together in 
the thirteenth century for King Alfonso X of Castile. Using this figure they 
deduced that the Julian year was too long by 10 minutes and 44 seconds, and 
so would be one day late every 134 years. Therefore, the committee reasoned 
that one leap year should be omitted every 134 years, which is about three 
every 400 years. 


So, regrettably, we are forced to admit that mathematical hindsight does 
not entail historical accuracy. Nonetheless, we can argue that the theory of 
continued fractions is what makes a particular approximation worthwhile, 
regardless of the methods actually used to obtain it. Even though the as- 
tronomers and mathematicians appointed by Gregory did not use continued 
fractions to obtain their results, they could not have found a good calendar 
which was entirely inconsistent with this topic.


4.4.2 How many semitones should there be in an octave?


In musical theory, the interval of an octave contains twelve semitones. Mu- 
sically inclined mathematicians (or mathematically talented musicians) may 
have wondered if there is anything special about the number twelve. Could 
one work with a musical system of, say, eleven, thirteen or forty-one semitones 
to the octave? In this section we shall use continued fractions to see that there 
are very good reasons for having twelve notes in an octave. For readers who 
may be unfamiliar with basic musical terminology, a brief summary is given 
in appendix 4 at the end of this chapter. 


There are coherent acoustical reasons for asserting that a combination of 
two musical notes at different pitches will be pleasing to the ear if the ratio of 
their frequencies is a “simple” rational number. The simplest ratios are 2 and
3/2: in musical terminology these correspond to the intervals of the octave and
the (perfect) fifth respectively. Suppose that we take a fixed note as the basis 
of a tonal system, and build upon this foundation two sequences of intervals, 
one consisting of fifths and the other of octaves. In order to obtain a coherent 
system of finitely many notes rather than an infinite mess, we require these 
two sequences to meet again at some point. Suppose, then, that p perfect fifths 
exactly equal $q$ octaves; in terms of frequencies, we have

$$\left(\frac{3}{2}\right)^p = 2^q .$$


Unfortunately, as is easily proved, this equation has no solutions in integers 
except for p = q = 0, which is musically trivial. So we shall once again employ 
continued fractions to find the best possible approximate solutions. Rewriting
the above equation to find the desired (but unachievable) value of $p/q$, and
then computing its continued fraction, we obtain

$$\frac{p}{q} = \frac{\log 2}{\log(3/2)} = 1 + \frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{3+}\,\frac{1}{1+}\,\frac{1}{5+}\,\frac{1}{2+}\,\frac{1}{23+}\,\frac{1}{2+}\cdots .$$


The first two convergents, 1/1 and 2/1, suggest that we should take a fifth, or two fifths,
to be the same as an octave: both of these approximations are far too crude 
to give satisfactory musical results. The third convergent suggests that we ap- 
proximate five fifths by three octaves: that is, the note five fifths above our 
fundamental pitch should be replaced by that three octaves above the funda- 
mental. Choosing an arbitrary fundamental for purposes of illustration, our 
tonal system would then consist of just five notes, where notes differing by an 
octave are regarded as “the same”. These notes, shown in figure 4.1, comprise 
a pentatonic system, and can be used to perform quite satisfying, if perhaps
extremely simple, music. Indeed, folk music from many parts of the world is
found to be based on pentatonic scales.

[Figure 4.1: Notes of a pentatonic scale.]

[Figure 4.2: Notes of the chromatic scale.]


If we want a system containing more notes, and therefore offering wider 
musical possibilities, we could turn to the next convergent and take twelve 
fifths to equal seven octaves. Our (modified) sequence of fifths could look like 
that displayed in figure 4.2. Transposing and reordering these notes gives the 
twelve-note chromatic scale which has been the basis of most Western music 
for many centuries. 


We see, then, that the approximation of irrational numbers by rationals, 
employing as its principal tool the convergents to a continued fraction, sug- 
gests a reason why there should be twelve “different” notes used in musical 
composition, or to put it another way, why there should be twelve semitones 
in an octave rather than eleven or thirteen. If we seek an even better ac- 
commodation of fifths and octaves we may look further along the sequence 
of convergents: the next would give us a 41-note scale, and if we ignore the
limitations of human performance we can make the ratio of fifths to octaves 
as accurate as we wish. 
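
The convergents behind the five-note, twelve-note and 41-note systems can be recovered numerically. The sketch below is illustrative rather than definitive: it assumes the target ratio is $\log 2 / \log(3/2)$, and it works in floating point, so only the first several partial quotients should be trusted.

```python
from math import log

def continued_fraction(x, terms):
    """First `terms` partial quotients of a positive float x."""
    quotients = []
    for _ in range(terms):
        a = int(x)                 # floor, since x is positive
        quotients.append(a)
        if x == a:
            break
        x = 1 / (x - a)
    return quotients

def convergents(quotients):
    """Convergents (p_k, q_k) from the usual two-term recurrence."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

ratio = log(2) / log(3 / 2)        # number of fifths per octave
quotients = continued_fraction(ratio, 9)
pairs = convergents(quotients)

print(quotients)
for fifths, octaves in pairs[:5]:
    print(f"{fifths} fifths ~ {octaves} octaves")
```

Each pair reads as "this many fifths to that many octaves"; the numerator is the number of notes in the resulting scale.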


Despite the beauty of the mathematics involved in this problem, we must 
again beware of using it as a substitute for history. There is no reason to believe 
that the chromatic scale was designed with continued fractions in mind. On 
the contrary, it seems clear that our present tonal system was not “designed” 
at all but simply developed in accordance with the needs of composers and 
performers. As in the calendar problem, however, we are very likely justified in 
asserting that this development is a manifestation of properties of continued 
fractions. The laws of nature hold in spite of our ignorance of them.


4.5 A “COMPUTATIONAL” TEST FOR RATIONALITY


Continued fractions can sometimes be used to give us an idea (though not 
necessarily a proof) that a certain number, presented as an infinite decimal, 
may be rational. For example, in connection with Apéry’s irrationality proof
for $\zeta(3)$ it was conjectured, see [66], that $\zeta(4)$ can be written as a sum

$$\zeta(4) = c \sum_{n=1}^{\infty} \frac{1}{n^4 \binom{2n}{n}}$$

with $c \in \mathbb{Q}$. Since it is known that $\zeta(4) = \pi^4/90$, the claim is, in effect, that

$$c = \frac{\pi^4}{90} \bigg/ \sum_{n=1}^{\infty} \frac{1}{n^4 \binom{2n}{n}}$$

is rational. Evaluating $c$ to 10 significant figures (which can be done by taking
the first 10 terms of the sum) and then calculating its continued fraction gives

$$c = 2.117647059 = 2 + \frac{1}{8+}\,\frac{1}{2+}\,\frac{1}{19607842+}\cdots .$$

Now the partial quotients in the continued fraction of a “sensible” real number 
generally consist of fairly small integers. In fact it can be shown (see, for 
example, Khinchin [35], section 16) that for a “randomly chosen” real number, 


a proportion about

$$\log_2\left( \left(1 + \frac{1}{a}\right) \bigg/ \left(1 + \frac{1}{a+1}\right) \right)$$


of the partial quotients should be equal to a given positive integer a; doing 
the appropriate calculations, we find that about 42% of the partial quotients 
should be 1, about 17% should be 2, and so on. One suspects, then, that 
the partial quotient 19607842 is due to numerical inaccuracy, and that the 
continued fraction should have terminated at the previous partial quotient. 
Therefore, it seems reasonable to believe that

$$c = 2 + \frac{1}{8+}\,\frac{1}{2} = \frac{36}{17} .$$

Obviously this does not constitute a rigorous proof, but in fact, it is possible
to prove that $c$ has the value $\frac{36}{17}$, as conjectured.
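
The computation described above can be sketched as follows. This is an illustration, not the author's code: it evaluates $c$ in floating point from the first 10 terms of the sum, rounds to 10 significant figures, and expands the rounded decimal exactly; the cut-off of 1000 for a "suspiciously large" partial quotient is an arbitrary assumption.

```python
from fractions import Fraction
from math import comb, pi

def continued_fraction(x):
    """Partial quotients of a rational number, by the Euclidean algorithm."""
    n, d = x.numerator, x.denominator
    quotients = []
    while d:
        a, r = divmod(n, d)
        quotients.append(a)
        n, d = d, r
    return quotients

def evaluate(quotients):
    """Value of the finite simple continued fraction with these quotients."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value

series = sum(1 / (n**4 * comb(2 * n, n)) for n in range(1, 11))
c = (pi**4 / 90) / series
rounded = Fraction(str(round(c, 9)))       # 10 significant figures

quotients = continued_fraction(rounded)
cut = next(i for i, a in enumerate(quotients) if a > 1000)   # assumed threshold
guess = evaluate(quotients[:cut])

print(round(c, 9))      # 2.117647059
print(quotients)        # a very large quotient appears after [2, 8, 2]
print(guess)            # 36/17
```

The percentages quoted from Khinchin follow from the formula above: for $a = 1$ it gives $\log_2(4/3) \approx 0.415$ and for $a = 2$ it gives $\log_2(9/8) \approx 0.170$.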


Another example of this technique arose in the present author’s investigations
which ultimately led to the paper [6]. I sought to evaluate generalised
continued fractions of the form

$$f(s) = \cfrac{1+s}{1 + \cfrac{2+s}{2 + \cfrac{3+s}{3 + \cdots}}}$$

for positive integers $s$. Computing decimal approximations of these expressions
for a few values of $s$, and then writing them as simple continued fractions gave
results such as

$$f(4) = 1.863013699 = 1 + \frac{1}{1+}\,\frac{1}{6+}\,\frac{1}{3+}\,\frac{1}{3+}\,\frac{1}{507356+}\cdots ,$$

$$f(5) = 2.085828343 = 2 + \frac{1}{11+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{6+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{12712+}\cdots .$$

As explained above, these calculations led to the conjectures

$$f(4) = 1 + \frac{1}{1+}\,\frac{1}{6+}\,\frac{1}{3+}\,\frac{1}{3} = \frac{136}{73}$$

and

$$f(5) = 2 + \frac{1}{11+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{6+}\,\frac{1}{1+}\,\frac{1}{1} = \frac{1045}{501} .$$



Moreover, similar outcomes were observed for all tested values of s, and so it 
was conjectured that f(s) is always rational when s is a positive integer. After 
a good deal of further work it was found possible to prove this assertion, and 
to confirm the specific values conjectured above. An informal account of the 
investigation is given in [5]. 
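
The experiment can be repeated in a few lines. The sketch below is a hedged illustration: it assumes that $f(s)$ is the generalised continued fraction $(1+s)/(1 + (2+s)/(2 + (3+s)/(3 + \cdots)))$, truncates it at an arbitrary depth of 60, and treats any partial quotient above 1000 as numerical noise. The depth, the threshold and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def f(s, depth=60):
    """Truncated value of (1+s)/(1 + (2+s)/(2 + (3+s)/(3 + ...)))."""
    tail = float(depth + 1)
    for k in range(depth, 0, -1):
        tail = k + (k + 1 + s) / tail
    return (1 + s) / tail

def continued_fraction(x):
    """Partial quotients of a rational number, by the Euclidean algorithm."""
    n, d = x.numerator, x.denominator
    quotients = []
    while d:
        a, r = divmod(n, d)
        quotients.append(a)
        n, d = d, r
    return quotients

def guess_rational(x, threshold=1000):
    """Round x to 9 decimals, expand it, and cut before the first huge quotient."""
    quotients = continued_fraction(Fraction(str(round(x, 9))))
    kept = []
    for a in quotients:
        if a > threshold:
            break
        kept.append(a)
    value = Fraction(kept[-1])
    for a in reversed(kept[:-1]):
        value = a + 1 / value
    return value

for s in (4, 5):
    print(s, round(f(s), 9), guess_rational(f(s)))
```

A guess produced this way is only a conjecture; as the text stresses, it still has to be proved by other means.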


4.6 FURTHER APPROXIMATION PROPERTIES OF CONVERGENTS


We know that for every irrational $\alpha$ the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}$$

has infinitely many solutions, and that for certain $\alpha$ the right-hand side can be
decreased by substituting for $q^2$ a higher power $q^s$. Indeed, if $\alpha$ is a Liouville
number then we can choose arbitrarily large $s$ and still find infinitely many
solutions; on the other hand, we know from Liouville’s Theorem, page 49, that
if $\alpha$ is a quadratic irrational then the exponent 2 cannot be increased at all.


Thus, if we want a result which is true for all a, we cannot decrease the 
right-hand side of the above inequality by increasing s. Perhaps, however, we 
could replace the 1 in the numerator by something smaller. Indeed, we can, 
for the comment following Theorem 4.13 shows that 1 can be replaced by $\frac12$.
Can we do even better than this?

Recall that convergents give “the best” approximations to a real number
$\alpha$, and that the approximations are especially good when the next partial
quotient is large. Consider what happens if the “next partial quotient” of $\alpha$ is
never large. An extreme example of such a number is

$$\alpha = 1 + \frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\cdots = \frac{1 + \sqrt5}{2} .$$

In this case it is plain that every complete quotient $\alpha_k$ is equal to $\alpha$. Moreover,
if we calculate the first few convergents to $\alpha$ it is very easy to conjecture,
and equally easy to prove by induction, that $q_k = p_{k-1}$ for $k \ge 0$. (It
is also easy to show, though unimportant at present, that the numerators
$p_k$ and denominators $q_k$ are just the Fibonacci numbers.) Therefore, from
equation (4.5), we have

$$\left|\alpha - \frac{p_k}{q_k}\right| = \frac{1}{(\alpha q_k + q_{k-1})q_k} = \frac{1}{(\alpha + q_{k-1}/q_k)q_k^2} \approx \frac{1}{(\alpha + \alpha^{-1})q_k^2} = \frac{1}{\sqrt5\,q_k^2} .$$

Since we do not expect that any real irrational number will have worse rational 
approximations than a, the following result is plausible. 


Theorem 4.14. (Hurwitz). Let $\alpha$ be a real irrational
number. Then the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{\sqrt5\,q^2}$$

has infinitely many rational solutions $p/q$.


[Portrait: Adolf Hurwitz (1859–1919)]

Proof. Write

$$r_k = \frac{q_{k-1}}{q_k}$$

for $k \ge 0$. The idea of the proof is to show that if the required inequality
fails for three consecutive convergents $p_k/q_k$, $p_{k+1}/q_{k+1}$ and $p_{k+2}/q_{k+2}$ to $\alpha$,
then the two consecutive quotients $r_{k+1}$ and $r_{k+2}$ are both strictly greater
than $\frac12(\sqrt5 - 1)$, and a contradiction follows. To begin the proof, we use the
equality in (4.5) to see that

$$\left|\alpha - \frac{p_k}{q_k}\right| < \frac{1}{\sqrt5\,q_k^2} \quad\text{if and only if}\quad \alpha_{k+1} + r_k > \sqrt5 . \qquad (4.8)$$

Suppose that in fact

$$\alpha_{k+1} + r_k \le \sqrt5 \quad\text{and}\quad \alpha_{k+2} + r_{k+1} \le \sqrt5 , \qquad (4.9)$$

and observe that we have

$$\alpha_{k+1} = a_{k+1} + \frac{1}{\alpha_{k+2}} \quad\text{and}\quad \frac{1}{r_{k+1}} = a_{k+1} + r_k , \qquad (4.10)$$

where the second identity follows immediately from the defining recursion for
$q_{k+1}$. Eliminating $a_{k+1}$ we obtain

$$\frac{1}{\alpha_{k+2}} + \frac{1}{r_{k+1}} = \alpha_{k+1} + r_k ,$$

and by using the assumed inequalities (4.9), we have

$$\frac{1}{\sqrt5 - r_{k+1}} + \frac{1}{r_{k+1}} \le \sqrt5 .$$

This inequality is easily simplified to give

$$r_{k+1}^2 - \sqrt5\,r_{k+1} + 1 \le 0 ,$$

and since the roots of the quadratic $x^2 - \sqrt5\,x + 1$ are $\frac12(\sqrt5 \pm 1)$ we obtain
$r_{k+1} \ge \frac12(\sqrt5 - 1)$. But $r_{k+1}$ is rational, so equality cannot hold and we have

$$r_{k+1} > \frac{\sqrt5 - 1}{2} .$$

If we now suppose further that $\alpha_{k+3} + r_{k+2} \le \sqrt5$, then exactly the same
argument gives

$$r_{k+2} > \frac{\sqrt5 - 1}{2} ,$$

but using the second identity from (4.10), with $k$ replaced by $k + 1$, we have

$$1 = (a_{k+2} + r_{k+1})\,r_{k+2} > \left(1 + \frac{\sqrt5 - 1}{2}\right)\frac{\sqrt5 - 1}{2} = 1 ,$$

which is absurd. To sum up, we have shown that a contradiction follows from
the assumption that the inequalities (4.8) are false for three consecutive
integers $k$; consequently, the required inequality is true for at least one in every
three convergents $p_k/q_k$, and the theorem is proved.
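
The "one in every three convergents" conclusion of the proof of Hurwitz' Theorem can be observed numerically. The sketch below is illustrative only: it assumes a 35-digit rational stand-in for $\pi$ and tests the inequality exactly, using the equivalence $|\alpha - p/q| < 1/(\sqrt5\,q^2) \iff 5q^4(\alpha - p/q)^2 < 1$ to avoid square roots.

```python
from fractions import Fraction
from math import floor

def convergents(x, count):
    """First `count` convergents (p_k, q_k) of x, for k = 0, 1, 2, ..."""
    p_prev, q_prev = 1, 0
    a = floor(x)
    p, q = a, 1
    result = [(p, q)]
    rest = x - a
    while rest != 0 and len(result) < count:
        x = 1 / rest
        a = floor(x)
        rest = x - a
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

def satisfies_hurwitz(x, p, q):
    """Exact test of |x - p/q| < 1/(sqrt(5) q^2)."""
    return 5 * q**4 * (x - Fraction(p, q))**2 < 1

# Assumption: a rational stand-in for pi, adequate for the first dozen convergents.
ALPHA = Fraction("3.14159265358979323846264338327950288")
flags = [satisfies_hurwitz(ALPHA, p, q) for p, q in convergents(ALPHA, 12)]

print(flags)
print(all(any(flags[i:i + 3]) for i in range(len(flags) - 2)))
```

For this choice of $\alpha$ the inequality fails at some convergents (those followed by a partial quotient equal to 1), but never at three in a row.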


Theorem 4.15. Hurwitz’ constant is best possible. The constant $\sqrt5$ in
Hurwitz’ Theorem is the best possible. That is, if $A$ is a constant greater than
$\sqrt5$, then there exist irrational numbers $\alpha$ such that the inequality

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{Aq^2} \qquad (4.11)$$

has only finitely many rational solutions.


Proof. We shall show that $\alpha = \frac12(1 + \sqrt5)$ is such a number (no surprise, in
view of the comments introducing Hurwitz’ Theorem). Let $A > \sqrt5$. For any
solution $p/q$ of (4.11), we have

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{Aq^2} < \frac{1}{\sqrt5\,q^2} < \frac{1}{2q^2} ,$$

and so by a previous result $p/q$ must be a convergent $p_k/q_k$ to $\alpha$. Using a
result asserted on page 88 (proof: see exercise 4.2), we have

$$\frac{1}{Aq^2} > \left|\alpha - \frac{p}{q}\right| = \frac{1}{(\alpha_{k+1}q_k + q_{k-1})q_k} = \frac{1}{(\alpha + q_{k-1}/p_{k-1})q^2}$$

and therefore

$$A < \alpha + \frac{q_{k-1}}{p_{k-1}} .$$

It follows that (4.11) has only a finite number of solutions. For otherwise this
inequality would hold for arbitrarily large $k$ and we would find

$$A \le \lim_{k\to\infty} \left(\alpha + \frac{q_{k-1}}{p_{k-1}}\right) = \alpha + \frac{1}{\alpha} = \sqrt5 ,$$

contrary to assumption. This completes the proof.
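
For a concrete illustration, take $\alpha = \frac12(1+\sqrt5)$ and a constant slightly larger than $\sqrt5$. The sketch below assumes the value $A = 2.24$ (chosen arbitrarily) and uses the criterion from the proof, that the convergent $p_k/q_k$ satisfies (4.11) exactly when $A < \alpha + q_{k-1}/q_k$; the case $k = 0$ is omitted because there $q_{-1}/q_0 = 0$.

```python
from math import sqrt

ALPHA = (1 + sqrt(5)) / 2

def denominators(count):
    """q_0, q_1, q_2, ... for the golden ratio: 1, 1, 2, 3, 5, ..."""
    q_prev, q = 0, 1               # q_{-1}, q_0
    out = []
    for _ in range(count):
        out.append(q)
        q_prev, q = q, q + q_prev
    return out

A = 2.24                           # assumed constant, just above sqrt(5)
q = denominators(40)

# Indices k for which the convergent p_k/q_k satisfies (4.11).
solutions = [k for k in range(1, 40) if ALPHA + q[k - 1] / q[k] > A]

# q_k^2 |alpha - p_k/q_k| = 1/(alpha + q_{k-1}/q_k), which tends to 1/sqrt(5).
scaled_errors = [1 / (ALPHA + q[k - 1] / q[k]) for k in range(1, 40)]

print(solutions)                   # [1, 3, 5]
print(scaled_errors[-1], 1 / sqrt(5))
```

Only the convergents $2/1$, $5/3$ and $13/8$ qualify; from then on the scaled error settles just either side of $1/\sqrt5$ and never again drops below $1/A$.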
Comments. 


• An alternative proof of this result has echoes of Liouville’s proof in
Chapter 3. Let

$$\alpha = \frac{1 + \sqrt5}{2} .$$

Then the minimal polynomial of $\alpha$ is $f(x) = x^2 - x - 1$, and for any
rational number $p/q$ we have by the Mean Value Theorem

$$\frac{f(\alpha) - f(p/q)}{\alpha - p/q} = f'(\gamma) = 2\gamma - 1$$

for some $\gamma$ between $\alpha$ and $p/q$. However, $f(\alpha) = 0$, and

$$\left|f\!\left(\frac{p}{q}\right)\right| = \left|\frac{p^2}{q^2} - \frac{p}{q} - 1\right| \ge \frac{1}{q^2}$$

and

$$|2\gamma - 1| = |(2\alpha - 1) + 2(\gamma - \alpha)| \le \sqrt5 + 2\left|\alpha - \frac{p}{q}\right| ;$$

putting all this information together yields

$$\frac{1}{q^2} \le \left(\sqrt5 + 2\left|\alpha - \frac{p}{q}\right|\right)\left|\alpha - \frac{p}{q}\right| .$$

All of this is true for any $p/q$; if $p/q$ is a solution of (4.11), then

$$A < \sqrt5 + 2\left|\alpha - \frac{p}{q}\right| < \sqrt5 + \frac{2}{Aq^2} .$$

Now suppose that $A > \sqrt5$. Then

$$q^2 < \frac{2}{A(A - \sqrt5)} ,$$

so there are only finitely many possibilities for $q$, and (4.11) can have
only finitely many rational solutions.

• In the proof of Theorem 4.15 we considered the number $\alpha = \frac12(1 + \sqrt5)$
and made use of the fact that every complete quotient of $\alpha$ is equal to
$\alpha$ itself. In fact, it would have been sufficient to consider a number $\beta$
having infinitely many complete quotients equal to $\alpha$. A little thought
shows that this is true if and only if

$$\beta = b_0 + \frac{1}{b_1+}\,\frac{1}{b_2+}\cdots\frac{1}{b_n+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\cdots$$

for some integers $b_0, b_1, \ldots, b_n$; in other words, if the partial quotients
of $\beta$ are all 1 from some point on. It can be shown that if $\alpha$ is such a
number then for $A > \sqrt5$ the inequality (4.11) has only a finite number
of solutions. On the other hand, if we exclude from consideration all
such $\alpha$ then the result can be improved, and we find that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{\sqrt8\,q^2}$$

for infinitely many rationals $p/q$. If even more $\alpha$ are ignored, then the
constant $\sqrt8$ can be increased still further. In fact, Hurwitz’ Theorem
is the first of a series of results which show that if $\alpha$ has a continued
fraction not in a certain finite number of categories, then there is a
constant $C$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{Cq^2}$$

has infinitely many solutions; the constants $C$ can be explicitly identified,
and can be shown to be best possible, in the sense of Theorem 4.15.
These constants make up a (countable) set

$$\left\{ \sqrt{9 - \frac{4}{m^2}} \;:\; m = 1, 2, 5, 13, 29, 34, \ldots \right\}$$

which forms part of what is known as the Lagrange spectrum. More
information (particularly on where the sequence $1, 2, 5, 13, 29, 34, \ldots$
comes from) may be found in Bombieri [16].


We can make use of properties of continued fractions to show, as asserted in 
Chapter 3, that there exist transcendental numbers which are not Liouville 
numbers. 


Theorem 4.16. Any Liouville number has a continued fraction with un- 
bounded partial quotients. 


Proof. Suppose that $\alpha$ has partial quotients $a_k \le A$ for $k \ge 1$. Now if $\alpha$ is
rational it is certainly not a Liouville number; suppose that $\alpha$ is irrational. If
$\alpha$ is approximable to order $s > 2$, then there is a constant $c$ such that

$$0 < \left|\alpha - \frac{p}{q}\right| < \frac{c}{q^s} \qquad (4.12)$$

for rationals $p/q$ with arbitrarily large $q$. Choose $q$ sufficiently large that
$q^{s-2} > (A + 2)c$. Then certainly $q^{s-2} > 2c$, and we have

$$0 < \left|\alpha - \frac{p}{q}\right| < \frac{c}{q^s} < \frac{1}{2q^2} ,$$

and by a previous result $p/q$ is one of the convergents to $\alpha$. For $p/q = p_k/q_k$
we have

$$\left|\alpha - \frac{p}{q}\right| = \frac{1}{(\alpha_{k+1}q_k + q_{k-1})q_k} > \frac{1}{(a_{k+1} + 2)q_k^2} \ge \frac{1}{(A + 2)q^2} ,$$

and since $q^{s-2} > (A + 2)c$ this contradicts (4.12). Hence $\alpha$ is not approximable
to order greater than 2, which proves considerably more than we claimed.


Corollary 4.17. Not all transcendental numbers are Liouville numbers. 


Proof. Consider all possible continued fractions

$$a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\,\frac{1}{a_3+}\cdots , \qquad (4.13)$$

where each $a_k$ is either 1 or 2. It is a standard result of set theory that the
set of all such continued fractions is uncountable; but the set of algebraic
numbers is countable. So there exist (uncountably many!) continued fractions
of the form (4.13) which are transcendental. But by the above theorem, none
of these continued fractions is a Liouville number.


From these results we know that certain transcendental numbers have 
continued fractions with unbounded partial quotients while others have con- 
tinued fractions with bounded partial quotients (see exercise 4.10 for a specific 
example of the latter). What happens if we ask similar questions regarding 
algebraic numbers? We already know that the continued fraction expansions 
for rationals and for quadratic irrationals are finite and periodic respectively, 
and hence clearly have bounded partial quotients, so we need only consider 
algebraic numbers of degree higher than 2. 


• Do there exist algebraic numbers of degree $n \ge 3$ whose continued
fractions have bounded partial quotients?


• Do there exist algebraic numbers of degree $n \ge 3$ whose continued
fractions have unbounded partial quotients?


Clearly one, perhaps both, of these problems must have an affirmative answer; 
unfortunately, each remains unsolved! There is not a single algebraic number 
of degree 3 or more whose complete continued fraction is known (though most 
experts in the field believe that such a number will always have unbounded 
partial quotients). Indeed, there appears to be very little of any great gen- 
erality that can be said about continued fractions for algebraic numbers of 
degree 3 or more, or for transcendental numbers. We can, however, give some 
useful methods of computation, as well as some results for specific important 
numbers.


4.7 COMPUTING THE CONTINUED FRACTION OF AN ALGEBRAIC IRRATIONAL


Let $\alpha$ be a root of a known irreducible polynomial $f$ with degree $n \ge 2$ and
integral coefficients. We shall assume that $\alpha > 1$, and that $f$ has no other
roots $\beta > 1$. In this case there is a very simple algorithm to find the zeroth
partial quotient $a_0 = \lfloor\alpha\rfloor$: calculate $f(1), f(2), f(3), \ldots$ until a change of sign
occurs; then $a_0$ is the last argument before the change of sign. Since $\alpha$ is
irrational we have $a_0 < \alpha < a_0 + 1$. The first complete quotient $\alpha_1$ is defined
by $\alpha = a_0 + 1/\alpha_1$; therefore

$$f\!\left(a_0 + \frac{1}{\alpha_1}\right) = 0 ,$$

and $\alpha_1$ is a root of the polynomial defined by

$$f_1(z) = z^n f\!\left(a_0 + \frac{1}{z}\right) .$$

Since, by assumption, $f$ has a unique real root greater than 1, it is not hard
to show that $f_1$ has the same property. For let $\beta = \alpha_1 = 1/(\alpha - a_0)$. Then

$$f_1(\beta) = \beta^n f\!\left(a_0 + \frac{1}{\beta}\right) = \beta^n f(\alpha) = 0 ,$$

and $\beta > 1$ since $0 < \alpha - a_0 < 1$; so $f_1$ has a real root greater than 1.
Conversely, if $\beta$ is any such root of $f_1$, then $a_0 + 1/\beta$ is a root of $f$ and hence
$a_0 + 1/\beta = \alpha$. Because $f_1$ is a polynomial having integral coefficients and a
unique real root $\alpha_1 > 1$, the procedure can be iterated to find the sequence of
partial quotients of $\alpha$. Observe that the complete quotients $\alpha = \alpha_0, \alpha_1, \alpha_2, \ldots$
need never be calculated, so we do not have the problem of calculating decimal
expansions to many places: all our calculations will be performed in terms of
integer arithmetic, and so the process will be free of rounding errors.


Example. We can find the continued fraction for W/2 by starting with the 
polynomial f(z) = fo(z) = 2° — 2. We have 
fo(z) = 227-2, a=1; 
fitz) = —2° +327 +3z+1, a,=3; 
fo(z) = 102" — 627 — 62-1, ag=1; 
fa(z) = —32° +1227+2427+10, a3 =5; 
fa(2) = 552" — 81z7—332-3, as=1; 
fo(z) = —62z2 — 3027 +842+55, as=1; 
fe(z) = 472? — 16227 — 216z— 62 , ag = 


and so 


3+ 14+ 54+ 14+ 14 44+.---¢ 
Alternatively, we can find the continued fraction for $\alpha = \sqrt[3]{2}$ by adapting
the method used for quadratic irrationals. However, $\alpha$ has degree 3 and we
shall therefore, in general, have to find reciprocals of sums of three terms.
Writing $\omega = e^{2\pi i/3}$, we have

$$\begin{aligned}
\frac{1}{a + b\sqrt[3]{2} + c\sqrt[3]{4}}
&= \frac{(a + b\omega\sqrt[3]{2} + c\omega^2\sqrt[3]{4})(a + b\omega^2\sqrt[3]{2} + c\omega\sqrt[3]{4})}{a^3 + 2b^3 + 4c^3 - 6abc}\\
&= \frac{a^2 - 2bc}{a^3 + 2b^3 + 4c^3 - 6abc} + \frac{2c^2 - ab}{a^3 + 2b^3 + 4c^3 - 6abc}\,\sqrt[3]{2}
 + \frac{b^2 - ac}{a^3 + 2b^3 + 4c^3 - 6abc}\,\sqrt[3]{4} ,
\end{aligned}$$

and clearly the algebra is going to be significantly harder than it is in the
quadratic case. The polynomial method involves much less work.
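
As a check on the reciprocal formula, the following sketch multiplies $a + bt + ct^2$ by the claimed inverse in exact rational arithmetic, reducing with $t^3 = 2$. The helper names are hypothetical and the triple $(x, y, z)$ stands for $x + yt + zt^2$.

```python
from fractions import Fraction


def reciprocal(a, b, c):
    """Coefficients (x, y, z) of 1/(a + b*t + c*t^2), where t is the cube root of 2."""
    norm = a**3 + 2 * b**3 + 4 * c**3 - 6 * a * b * c
    return (Fraction(a * a - 2 * b * c, norm),
            Fraction(2 * c * c - a * b, norm),
            Fraction(b * b - a * c, norm))


def multiply(u, v):
    """Product of u0 + u1*t + u2*t^2 and v0 + v1*t + v2*t^2, reduced using t^3 = 2."""
    u0, u1, u2 = u
    v0, v1, v2 = v
    return (u0 * v0 + 2 * (u1 * v2 + u2 * v1),
            u0 * v1 + u1 * v0 + 2 * u2 * v2,
            u0 * v2 + u1 * v1 + u2 * v0)


# The product with the claimed inverse should be exactly 1 = (1, 0, 0).
print(multiply((3, -1, 2), reciprocal(3, -1, 2)))
```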


Comment. In the above discussion we assumed that $f_0$ has no real root
$\beta > 1$ except for $\beta = \alpha$. If this is not true we may apply the same algorithm
anyway, and obtain an appropriate sequence of functions $f_k$. The only possible
additional difficulty lies in determining the partial quotients $a_k = \lfloor\alpha_k\rfloor$: it may
be that $f_k$ has more than one root greater than 1, and we need to ensure that
we choose the correct one as $\alpha_k$. However, it can be shown that the assumed
condition must be true, if not for $f_0$, then after a certain number of iterations,
and once this happens the algorithm proceeds with little difficulty.


Example. Find the first few terms of the continued fraction of $\alpha$, the smallest
positive root of

$$f(z) = 2z^3 - 20z^2 + 62z - 61 .$$

Solution. Being careful not to miss a root, we find that $f$ has two roots
between 2 and 3, and one between 5 and 6. So $a_0 = 2$ and

$$f_1(z) = z^3 f\Bigl(2 + \frac{1}{z}\Bigr) = -z^3 + 6z^2 - 8z + 2 .$$

Now $f_1$ has roots between 0 and 1, which is of no interest to us, between 1
and 2, and between 4 and 5. The smallest root of $f$ corresponds to the largest
root of $f_1$; therefore $a_1 = 4$ and

$$f_2(z) = z^3 f_1\Bigl(4 + \frac{1}{z}\Bigr) = 2z^3 - 8z^2 - 6z - 1 .$$

We find that $f_2$ has only one root greater than 1 and so the procedure is
standard from now on; we obtain

$$\alpha = 2 + \frac{1}{4+}\,\frac{1}{4+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{149+}\,\frac{1}{3+}\cdots .$$
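
The steps of this example can be retraced in Python. The sketch below is an illustration only: the first two partial quotients are supplied by hand, exactly as in the solution, because the sign-change search is valid only once a single root greater than 1 remains. All function names are hypothetical.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner evaluation; coefficients are stored constant term first."""
    value = 0
    for coefficient in reversed(coeffs):
        value = value * x + coefficient
    return value


def next_polynomial(coeffs, a):
    """Coefficients (constant term first) of z^n f(a + 1/z)."""
    c = list(coeffs)
    n = len(c) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            c[j] += a * c[j + 1]
    return c[::-1]


def floor_of_unique_root(coeffs):
    """Integer part of the root > 1, assuming there is exactly one such root."""
    positive_at_one = evaluate(coeffs, 1) > 0
    k = 1
    while (evaluate(coeffs, k + 1) > 0) == positive_at_one:
        k += 1
    return k


f0 = [-61, 62, -20, 2]                     # 2z^3 - 20z^2 + 62z - 61

# f(2) and f(3) are both negative, so an integer scan alone misses two roots;
# the value at 5/2 is positive, which reveals them.
print(evaluate(f0, 2), evaluate(f0, Fraction(5, 2)), evaluate(f0, 3))

f1 = next_polynomial(f0, 2)                # a_0 = 2, chosen for the smallest root
f2 = next_polynomial(f1, 4)                # a_1 = 4, the largest root of f_1

quotients = [2, 4]
f = f2
for _ in range(6):                         # standard procedure from f_2 onwards
    a = floor_of_unique_root(f)
    quotients.append(a)
    f = next_polynomial(f, a)
print(quotients)                           # [2, 4, 4, 1, 1, 1, 149, 3]
```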


Another point worth noting is that the procedure which locates a root of 
a polynomial by searching for a sign change will fail for a double root. In this 
case, however, we chose our original polynomial f to be irreducible, and it 
follows from exercise 3.11 that f cannot have a double root. 
4.8 THE CONTINUED FRACTION OF e


We shall determine the continued fractions for a class of numbers related to
the exponential constant $e$. To do so, we first consider the functions defined
by the infinite series

$$f(c; z) = \sum_{k=0}^{\infty} \frac{1}{c(c+1)\cdots(c+k-1)}\,\frac{z^k}{k!} ;$$

here the parameter $c$ is any real number except $0, -1, -2, \ldots$, and it is easy
to show that the series converges for all $z$. To simplify the notation we write
$c^{(k)} = c(c+1)\cdots(c+k-1)$, with the understanding that $c^{(0)} = 1$; thus

$$f(c; z) = \sum_{k=0}^{\infty} \frac{1}{c^{(k)}}\,\frac{z^k}{k!} .$$

The expression $c^{(k)}$ is referred to as “c rising factorial k”. It satisfies the two
important recurrences

$$c^{(k+1)} = c^{(k)}(c+k) = c\,(c+1)^{(k)} , \tag{4.14}$$

both of which are instances of the more general relation $c^{(k+m)} = c^{(k)}(c+k)^{(m)}$.
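
A minimal sketch of the rising factorial, with the recurrences (4.14) and the general relation checked in exact arithmetic. The function name `rising` is a choice made here for illustration.

```python
from fractions import Fraction


def rising(c, k):
    """Rising factorial c^(k) = c (c+1) ... (c+k-1), with c^(0) = 1."""
    result = Fraction(1)
    for i in range(k):
        result *= c + i
    return result


c = Fraction(1, 2)
for k in range(6):
    # the two recurrences of (4.14)
    assert rising(c, k + 1) == rising(c, k) * (c + k) == c * rising(c + 1, k)
    for m in range(6):
        # the general relation c^(k+m) = c^(k) (c+k)^(m)
        assert rising(c, k + m) == rising(c, k) * rising(c + k, m)

print(rising(c, 3))   # (1/2)(3/2)(5/2) = 15/8
```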


Lemma 4.18. Let $c$ be a positive real number, $z$ a non-zero real number and
$k$ a non-negative integer; then

$$\frac{c}{z}\,\frac{f(c; z^2)}{f(c+1; z^2)}
 = \left[\frac{c}{z},\ \frac{c+1}{z},\ \ldots,\ \frac{c+k-1}{z},\ \frac{c+k}{z}\,\frac{f(c+k; z^2)}{f(c+k+1; z^2)}\right] .$$


Proof. First, observe that under the stated conditions $f(c+k+1; z^2)$ is given
by a series of positive terms, so it does not vanish and the last term in the
continued fraction makes sense. From (4.14) the rising factorials satisfy

$$\frac{1}{c^{(k)}} = \frac{c+k}{c^{(k+1)}} = \frac{1}{(c+1)^{(k)}} + \frac{k}{c^{(k+1)}} ,$$

and hence

$$f(c; z) = \sum_{k=0}^{\infty} \frac{1}{(c+1)^{(k)}}\,\frac{z^k}{k!} + \sum_{k=0}^{\infty} \frac{k}{c^{(k+1)}}\,\frac{z^k}{k!} .$$

The first series on the right-hand side is evidently $f(c+1; z)$. The second may
be written

$$\sum_{k=1}^{\infty} \frac{1}{c^{(k+1)}}\,\frac{z^k}{(k-1)!}
 = \sum_{k=0}^{\infty} \frac{1}{c^{(k+2)}}\,\frac{z^{k+1}}{k!}
 = \frac{z}{c(c+1)} \sum_{k=0}^{\infty} \frac{1}{(c+2)^{(k)}}\,\frac{z^k}{k!} ,$$

and we have the second-order recurrence

$$f(c; z) = f(c+1; z) + \frac{z}{c(c+1)}\,f(c+2; z) .$$
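
The second-order recurrence can be checked numerically. The sketch below sums a truncated series in floating point; the truncation at 80 terms and the tolerance are assumptions made for this illustration, not part of the proof.

```python
import math


def f_series(c, z, terms=80):
    """Partial sum of f(c; z) = sum_k z^k / (c^(k) k!)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # term_{k+1} / term_k = z / ((c + k)(k + 1))
        term *= z / ((c + k) * (k + 1))
    return total


for c in (0.5, 1.0, 3.7):
    for z in (-1.5, 0.3, 4.0):
        left = f_series(c, z)
        right = f_series(c + 1, z) + z / (c * (c + 1)) * f_series(c + 2, z)
        assert math.isclose(left, right, rel_tol=1e-12, abs_tol=1e-12)
print("recurrence holds to rounding error at the sampled points")
```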
Rearranging this equation and replacing $z$ by $z^2$ gives an identity which looks
something like the beginning of a continued fraction,

$$\frac{f(c; z^2)}{f(c+1; z^2)} = 1 + \frac{z^2/(c(c+1))}{f(c+1; z^2)/f(c+2; z^2)} ;$$

however, we need to convert this into a form in which the numerator $z^2/(c(c+1))$
is replaced by 1. So, multiply the quotient $f(c; z^2)/f(c+1; z^2)$ by a factor $A_c$
to give

$$A_c\,\frac{f(c; z^2)}{f(c+1; z^2)}
 = A_c + \frac{A_c z^2/(c(c+1))}{f(c+1; z^2)/f(c+2; z^2)}
 = A_c + \frac{A_c A_{c+1} z^2/(c(c+1))}{A_{c+1}\, f(c+1; z^2)/f(c+2; z^2)} . \tag{4.15}$$

We want $A_c A_{c+1} z^2/(c(c+1)) = 1$, that is, $A_c A_{c+1} = c(c+1)/z^2$, and it will
suffice to take $A_c = c/z$. Making this choice and substituting into (4.15), we
have

$$\frac{c}{z}\,\frac{f(c; z^2)}{f(c+1; z^2)} = \frac{c}{z} + \cfrac{1}{\dfrac{c+1}{z}\,\dfrac{f(c+1; z^2)}{f(c+2; z^2)}} .$$

But now the complete quotient on the right-hand side can be treated in the
same way, and the whole process repeated $k$ times, to yield

$$\frac{c}{z}\,\frac{f(c; z^2)}{f(c+1; z^2)}
 = \left[\frac{c}{z},\ \frac{c+1}{z},\ \ldots,\ \frac{c+k-1}{z},\ \frac{c+k}{z}\,\frac{f(c+k; z^2)}{f(c+k+1; z^2)}\right]$$

as claimed.


Theorem 4.19. Let $c$ and $z$ be real numbers with $c > 0$ and $z \ne 0$, such that
$(c+k)/z$ is a positive integer for all $k \ge 0$. Then

$$\frac{c}{z}\,\frac{f(c; z^2)}{f(c+1; z^2)} = \left[\frac{c}{z},\ \frac{c+1}{z},\ \frac{c+2}{z},\ \ldots\right] .$$


Proof. Using the previous lemma and the uniqueness result Lemma 4.7, all
we need show is that

$$\frac{c+k}{z}\,\frac{f(c+k; z^2)}{f(c+k+1; z^2)} > 1 \tag{4.16}$$

for $k \ge 0$. But for every $j \ge 1$ we have

$$\begin{aligned}
(c+k+1)^{(j)} &= (c+k+1)(c+k+2)\cdots(c+k+j)\\
 &> (c+k)(c+k+1)\cdots(c+k+j-1)\\
 &= (c+k)^{(j)} ,
\end{aligned}$$

and noting that $z^2$ is a positive real number, it follows that

$$f(c+k; z^2) = \sum_{j=0}^{\infty} \frac{1}{(c+k)^{(j)}}\,\frac{z^{2j}}{j!}
 > \sum_{j=0}^{\infty} \frac{1}{(c+k+1)^{(j)}}\,\frac{z^{2j}}{j!} = f(c+k+1; z^2) .$$

Since by assumption $(c+k)/z \ge 1$, the inequality (4.16) holds and the result
is proved.

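Finally we must find $c$ and $z$ such that the conditions in the above result
hold, and also such that the series representing $f(c; z^2)$ and $f(c+1; z^2)$ can
be summed without excessive difficulty. We need $c/z$ and $1/z$ to be positive
integers, say $c/z = p$ and $1/z = q$; then

$$\frac{c}{z}\,\frac{f(c; z^2)}{f(c+1; z^2)} = [\,p,\ p+q,\ p+2q,\ p+3q,\ \ldots\,] .$$

Moreover, if $k \ge 0$, we have

$$\bigl(\tfrac12\bigr)^{(k)}\,k! = \bigl(\tfrac12\bigr)\bigl(\tfrac32\bigr)\cdots\bigl(k-\tfrac12\bigr)\cdot(1)(2)\cdots(k)
 = \bigl(\tfrac12\bigr)(1)\bigl(\tfrac32\bigr)(2)\cdots\bigl(k-\tfrac12\bigr)(k) = \frac{(2k)!}{2^{2k}} ,$$

and so

$$f\bigl(\tfrac12; z^2\bigr) = \sum_{k=0}^{\infty} \frac{1}{(\frac12)^{(k)}}\,\frac{z^{2k}}{k!}
 = \sum_{k=0}^{\infty} \frac{(2z)^{2k}}{(2k)!} = \cosh 2z .$$

Similarly $f(\tfrac32; z^2) = (2z)^{-1}\sinh 2z$; taking $q = 2p$ and combining these results,

$$\frac{\cosh(1/p)}{\sinh(1/p)} = p\,\frac{f\bigl(\tfrac12; \tfrac{1}{4p^2}\bigr)}{f\bigl(\tfrac32; \tfrac{1}{4p^2}\bigr)} .$$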
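
The two closed forms, and the resulting expression for $\coth(1/p)$, can be confirmed numerically. This is a sketch with a truncated series; the number of terms and the tolerances are assumptions of the illustration.

```python
import math


def f_series(c, z, terms=80):
    """Partial sum of f(c; z) = sum_k z^k / (c^(k) k!)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= z / ((c + k) * (k + 1))
    return total


for z in (0.1, 0.5, 1.0):
    assert math.isclose(f_series(0.5, z * z), math.cosh(2 * z), rel_tol=1e-12)
    assert math.isclose(f_series(1.5, z * z), math.sinh(2 * z) / (2 * z), rel_tol=1e-12)

for p in (1, 2, 5):
    z = 1 / (2 * p)                      # c = 1/2 and q = 1/z = 2p
    ratio = p * f_series(0.5, z * z) / f_series(1.5, z * z)
    coth = math.cosh(1 / p) / math.sinh(1 / p)
    assert math.isclose(ratio, coth, rel_tol=1e-12)
print("closed forms agree with the series at the sampled points")
```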


We have proved the following result. 
Theorem 4.20. For any positive integer $p$, we have

$$\coth\frac{1}{p} = p + \frac{1}{3p+}\,\frac{1}{5p+}\,\frac{1}{7p+}\cdots .$$
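
A quick numerical illustration of this expansion: the sketch below evaluates a truncated continued fraction exactly and compares it with a floating-point value of $\coth(1/p)$. The helper names and the choice of ten partial quotients are assumptions made here.

```python
import math
from fractions import Fraction


def continued_fraction_value(partial_quotients):
    """Exact value of the finite continued fraction [a_0, a_1, ..., a_n]."""
    value = Fraction(partial_quotients[-1])
    for a in reversed(partial_quotients[:-1]):
        value = a + 1 / value
    return value


def coth(x):
    return math.cosh(x) / math.sinh(x)


for p in (1, 2, 3, 5):
    quotients = [(2 * k + 1) * p for k in range(10)]    # p, 3p, 5p, ...
    approximation = continued_fraction_value(quotients)
    assert abs(float(approximation) - coth(1 / p)) < 1e-12
print(continued_fraction_value([1, 3, 5]))              # third convergent of coth 1
```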


Corollary 4.21. The numbers

$$\coth\frac{1}{p} , \qquad \frac{e+1}{e-1} , \qquad e ,$$

where in the first $p$ is a positive integer, are neither rational numbers nor
quadratic irrationalities.


Proof. The first claim is true because $\coth(1/p)$ has an infinite non-periodic
continued fraction. The second number given is in fact $\coth\frac12$. If $e$ were rational
or a quadratic irrational, then $(e+1)/(e-1)$ would be too.
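Comment. The function $f(c; z)$ employed in the above proof is closely related
to the hypergeometric function, defined by the series

$$F(a, b; c; z) = \sum_{k=0}^{\infty} \frac{a(a+1)\cdots(a+k-1)\,b(b+1)\cdots(b+k-1)}{c(c+1)\cdots(c+k-1)}\,\frac{z^k}{k!}
 = \sum_{k=0}^{\infty} \frac{a^{(k)}\,b^{(k)}}{c^{(k)}}\,\frac{z^k}{k!}$$

for $|z| < 1$. Here $a, b$ and $c$ are real parameters with $c \ne 0, -1, -2, \ldots$. The
hypergeometric function is a solution of the differential equation

$$z(1-z)y'' + \bigl(c - (a+b+1)z\bigr)y' - aby = 0 .$$

A large number of important functions are special cases of the hypergeometric
function. For example,

$$\begin{aligned}
F(1, 1; 1; z) &= \sum_{k=0}^{\infty} z^k = \frac{1}{1-z} ;\\
F(-n, 1; 1; -z) &= \sum_{k=0}^{n} \binom{n}{k} z^k = (1+z)^n \quad\text{for } n \in \mathbb{N} ;\\
zF(1, 1; 2; z) &= \sum_{k=0}^{\infty} \frac{z^{k+1}}{k+1} = -\log(1-z) ;
\end{aligned}$$

while by taking $c = \frac12$ in the recurrence (4.14) we find $(\frac32)^{(k)}/(\frac12)^{(k)} = 2k+1$
and hence

$$zF\bigl(\tfrac12, 1; \tfrac32; -z^2\bigr) = \sum_{k=0}^{\infty} \frac{(\frac12)^{(k)}}{(\frac32)^{(k)}}\,(-1)^k z^{2k+1}
 = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{2k+1} = \tan^{-1} z .$$

The functions cosh and sinh cannot be obtained by making a simple substitution
of particular values for $a, b$ and $c$; however, we have

$$\begin{aligned}
\lim_{a,b\to\infty} zF\Bigl(a, b; \tfrac32; \frac{z^2}{4ab}\Bigr)
 &= \sum_{k=0}^{\infty} \Bigl(\lim_{a\to\infty}\frac{a^{(k)}}{a^k}\Bigr)\Bigl(\lim_{b\to\infty}\frac{b^{(k)}}{b^k}\Bigr)\frac{1}{(\frac32)^{(k)}\,k!}\,\frac{z^{2k+1}}{4^k}\\
 &= \sum_{k=0}^{\infty} \frac{z^{2k+1}}{(2k+1)!}\\
 &= \sinh z ,
\end{aligned}$$

it being possible to justify the interchange of sum and limits.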


By using the continued fractions found above we may compute the contin- 
ued fraction for e itself. 


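Theorem 4.22. The continued fraction of $e$ is

$$e = 2 + \frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{4+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{6+}\,\frac{1}{1+}\cdots .$$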
Proof. Let the convergents of the continued fractions $\alpha = [2, 6, 10, 14, \ldots]$ and
$\beta = [1, 1, 2, 1, 1, 4, 1, 1, 6, \ldots]$ be $p_k/q_k$ and $r_k/s_k$ respectively. The $k$th partial
quotient of the former is $4k+2$, and so we have

$$p_k = (4k+2)p_{k-1} + p_{k-2} , \qquad q_k = (4k+2)q_{k-1} + q_{k-2} .$$

The partial quotients of the second continued fraction are

$$a_m = \begin{cases} 2k & \text{if } m = 3k-1\\ 1 & \text{otherwise,} \end{cases}$$

and so for any $k \ge 1$ we can write

$$\begin{aligned}
r_{3k+1} &= r_{3k} + r_{3k-1}\\
r_{3k} &= r_{3k-1} + r_{3k-2}\\
r_{3k-1} &= 2k\,r_{3k-2} + r_{3k-3}\\
r_{3k-2} &= r_{3k-3} + r_{3k-4}\\
r_{3k-3} &= r_{3k-4} + r_{3k-5} .
\end{aligned}$$

Eliminating $r_{3k}, r_{3k-1}, r_{3k-3}$ and $r_{3k-4}$ from these equations gives a recurrence
involving $r_{3k+1}, r_{3k-2}$ and $r_{3k-5}$; the same recurrence is satisfied by the
corresponding denominators, and we have

$$r_{3k+1} = (4k+2)r_{3k-2} + r_{3k-5} , \qquad s_{3k+1} = (4k+2)s_{3k-2} + s_{3k-5} .$$

Since $p_k$ and $q_k$ satisfy the same relations, a little attention to initial values
and an easy induction shows that

$$r_{3k+1} = 2q_k \quad\text{and}\quad s_{3k+1} = p_k - q_k$$

for all $k \ge -1$. From Theorem 4.20 we have $\alpha = \coth\frac12 = (e+1)/(e-1)$;
therefore

$$\beta = \lim_{k\to\infty} \frac{r_k}{s_k} = \lim_{k\to\infty} \frac{r_{3k+1}}{s_{3k+1}} = \lim_{k\to\infty} \frac{2q_k}{p_k - q_k} = \frac{2}{\alpha - 1} = e - 1 ,$$

from which we obtain the continued fraction for $e$.
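
The relations between the two families of convergents are easy to confirm by machine. The following sketch uses the usual starting values $p_{-2} = 0$, $p_{-1} = 1$, $q_{-2} = 1$, $q_{-1} = 0$; the function name and the cut-off of eight terms are choices made for this illustration.

```python
import math


def convergents(partial_quotients):
    """Lists of numerators and denominators of the convergents."""
    p_prev2, q_prev2 = 0, 1      # p_{-2}, q_{-2}
    p_prev, q_prev = 1, 0        # p_{-1}, q_{-1}
    numerators, denominators = [], []
    for a in partial_quotients:
        p, q = a * p_prev + p_prev2, a * q_prev + q_prev2
        numerators.append(p)
        denominators.append(q)
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
    return numerators, denominators


K = 8
alpha_quotients = [4 * k + 2 for k in range(K)]                       # 2, 6, 10, ...
beta_quotients = [2 * (m + 1) // 3 if m % 3 == 2 else 1 for m in range(3 * K)]

p, q = convergents(alpha_quotients)
r, s = convergents(beta_quotients)

for k in range(K):
    assert r[3 * k + 1] == 2 * q[k]
    assert s[3 * k + 1] == p[k] - q[k]

# beta = e - 1, so 1 + r/s should approach e
print(beta_quotients[:9], abs(1 + r[-1] / s[-1] - math.e) < 1e-12)
```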


The continued fractions for e and related numbers were first determined 
by Euler ([25]; English translation in [70]). He begins with a certain differential
equation, and subsequent calculations involve an expression which in the
notation used here is f(1 + i; =), Earlier in the same paper, he states that a 
number is rational if and only if its simple continued fraction terminates; from 
which it may be deduced that e is irrational. Curiously, Euler does not explic- 
itly draw this conclusion; perhaps he felt that it was too obvious to be worth 
writing down. An interesting historical discussion of [25] may be found in [21]. 
A different method of calculating the continued fraction of e was published by 
Hermite [31] in 1873; it involves an integral very similar to the one we have 
used in Chapter 2 to prove the irrationality of $e^r$. Accessible expositions of
Hermite’s method are given in Olds [47] and Cohn [20]. 
EXERCISES
122 
4.1 (a) Find the continued fraction of =. 
(b) Find the continued fraction of ,/a? + a+ 4, where a is a positive 


integer. 


(c) Evaluate the eventually periodic continued fraction 


2 de De Oe a ee Be De a ee 


4.2 Consider the continued fraction

$$\alpha = a_0 + \frac{1}{a_1+}\cdots\frac{1}{a_n}$$

where all $a_k$ (including $a_0$) are positive integers. As usual, write $p_k/q_k$
for the convergents to $\alpha$.


(a) Find the continued fraction of $p_n/p_{n-1}$.


(b) Show that the sequence of partial quotients is palindromic (that
is, $a_0 = a_n$, $a_1 = a_{n-1}$ and so on) if and only if $p_{n-1} = q_n$.


4.3 Simplify $p_{k+2}q_k - p_kq_{k+2}$ and $p_{k+3}q_k - p_kq_{k+3}$. Generalise.


4.4 Prove Euler’s rule for computing the convergents of a continued fraction.
Write down the product $a_0a_1\cdots a_n$, and all products which can
be obtained by deleting any number of pairs $a_ka_{k+1}$ of adjacent factors
from this product; if $n$ is odd, one of the products obtained contains no
factors and is taken to be 1. Let $p_n$ be the sum of all these products.
Find $q_n$ by applying a similar process to the product $a_1a_2\cdots a_n$. Then

$$\frac{p_n}{q_n} = a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\cdots\frac{1}{a_n} .$$
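4.5 Let $a_0$ be an integer and $a_1, a_2, \ldots$ positive integers. Evaluate the matrix
product

$$\begin{pmatrix} a_0 & 1\\ 1 & 0 \end{pmatrix}\begin{pmatrix} a_1 & 1\\ 1 & 0 \end{pmatrix}\cdots\begin{pmatrix} a_k & 1\\ 1 & 0 \end{pmatrix} ,$$

and hence give a proof, different from that in Lemma 4.1, of the relation
$p_{k-1}q_k - p_kq_{k-1} = (-1)^k$.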


4.6 Evaluate

$$a_0 + \frac{1}{a_1+}\cdots\frac{1}{a_k+}\,\frac{1}{a_k+}\cdots\frac{1}{a_1+}\,\frac{1}{a_0}$$

in terms of the partial numerators and denominators of $[a_0, a_1, \ldots, a_k]$.
Ensure that your answer is given as a fraction in lowest terms.
4.7 Consider the periodic continued fraction

$$\alpha = a_0 + \frac{1}{a_1+}\cdots\frac{1}{a_{n-1}+}\,\frac{1}{2a_0+}\,\frac{1}{a_1+}\cdots\frac{1}{a_{n-1}+}\,\frac{1}{2a_0+}\cdots$$

where the partial quotients $a_0, a_1, \ldots, a_{n-1}$ are positive integers and the
sequence $a_1, \ldots, a_{n-1}$ is palindromic. Prove that $\alpha^2$ is rational.


4.8 For any quadratic irrational $\xi = x + \sqrt{y}$ with $x, y \in \mathbb{Q}$ and $\sqrt{y} \notin \mathbb{Q}$,
we write $\xi^*$ for the conjugate of $\xi$, in the sense of Chapter 3: that is,
$\xi^* = x - \sqrt{y}$. The aim of the present exercise is to prove that a quadratic
irrational $\alpha$ has a purely periodic continued fraction

$$\alpha = a_0 + \frac{1}{a_1+}\cdots\frac{1}{a_{n-1}+}\,\frac{1}{a_0+}\,\frac{1}{a_1+}\cdots\frac{1}{a_{n-1}+}\,\frac{1}{a_0+}\cdots$$

with all $a_k$ positive integers, if and only if $\alpha > 1$ and $-1 < \alpha^* < 0$.


(a) If $\alpha$ has the given continued fraction, explain why $\alpha > 1$ (easy).
Find the minimal polynomial of $\alpha$ in terms of the numerators and
denominators of convergents, and use it to show that $-1 < \alpha^* < 0$.


(b) Now suppose that $\alpha > 1$ and $-1 < \alpha^* < 0$. We know from Theorem
4.9 that the continued fraction

$$\alpha = a_0 + \frac{1}{a_1+}\,\frac{1}{a_2+}\cdots$$

is ultimately periodic: that is, there are integers $n$ and $m_0$ such
that $a_{m+n} = a_m$ whenever $m \ge m_0$. As usual we write $\alpha_k$ for the
$k$th complete quotient of $\alpha$.


First, show that for all $k \ge 0$, we have $\alpha_k > 1$ and $-1 < \alpha_k^* < 0$.
(c) Next, prove that for every $k$, we have

$$a_k = \Bigl\lfloor -\frac{1}{\alpha_{k+1}^*} \Bigr\rfloor ,$$

where $\lfloor\,\cdot\,\rfloor$ is the floor or greatest-integer function.
(d) Finally, show that $a_{m_0-1+n} = a_{m_0-1}$, and explain how this proves
the stated result.


4.9 Let $d$ be a non-square positive integer. Show that if $x, y$ are positive
integers and $x^2 - dy^2 = \pm 1$, then $x/y$ is a convergent to $\sqrt{d}$.


4.10 (a) Let $a_1, a_2, \ldots, a_n$ be positive integers, and let $p_k/q_k$ be the $k$th
convergent (in lowest terms) of the continued fraction

$$\frac{1}{a_1+}\,\frac{1}{a_2+}\cdots\frac{1}{a_n} .$$

Show that if $a_n > 1$, then the continued fraction

$$\alpha = \frac{1}{a_1+}\cdots\frac{1}{a_{n-1}+}\,\frac{1}{(a_n+1)+}\,\frac{1}{(a_n-1)+}\,\frac{1}{a_{n-1}+}\cdots\frac{1}{a_1}$$

is equal to

$$\frac{p_n}{q_n} + \frac{(-1)^n}{q_n^2} .$$


(b) Show how one may find, without computing assistance, the partial
quotients of the “decimal”

$$\beta = 0.110100010000000100\cdots = \sum_{k=0}^{\infty} \frac{1}{g^{2^k}}$$

in base $g > 2$. Illustrate your solution by finding the 2345th partial
quotient of $\beta$.


(c) Find the exact order of approximability of $\beta$. That is, find $s$ such
that $\beta$ is approximable to order $s$, and to no higher order.


(d) What happens if $g = 2$?
4.11 Prove that the continued fraction

$$\alpha = \frac{1}{10^{1!}+}\,\frac{1}{10^{2!}+}\,\frac{1}{10^{3!}+}\cdots$$

represents a Liouville number.


4.12 (a) Prove Kronecker’s approximation theorem: for any real irrational
$\alpha$, any real $\beta$ and any positive real $\varepsilon$ there exist infinitely many
pairs of integers $p, q$ with $q > 0$ such that $|q\alpha - p - \beta| < \varepsilon$.


(b) Show that the following is equivalent to Kronecker’s Theorem: for
any real irrational $\alpha$ and any real $\beta_1, \beta_2$ with $\beta_1 < \beta_2$, there exist
infinitely many pairs $p, q$ with $q > 0$ such that

$$\beta_1 < q\alpha - p < \beta_2 .$$

(c) Use the result in (b) to prove that there is a power of 2 beginning
with any given string of decimal digits.


4.13 Use continued fractions to “explain” the approximation

$$\pi \approx \sqrt[4]{\frac{2143}{22}} = 3.14159265258\cdots ,$$

discovered by Ramanujan. Also the approximation

$$\pi \approx \Bigl(\frac{16}{9}\Bigr)^2 = 3.16049\cdots :$$

this is implicit in the Rhind papyrus from ancient Egypt, which was
copied in about 1500 B.C. from an earlier source.


4.14 Let $\alpha$ be irrational. Prove that at least one of any two consecutive
convergents to $\alpha$ satisfies the inequality

$$\Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{2q^2} .$$


4.15 To ten decimal places we have $\pi^{12}/\zeta(12) = 924041.7872648336$. Use
continued fractions to conjecture a rational value for $\pi^{12}/\zeta(12)$.


Comment. To get a convincing answer you will need to use a calculator
with at least 10 digits accuracy.


4.16 For any $n \ge 1$, find the best possible constant $A_n$ such that the following
result is true: if $\alpha$ is a real irrational number with infinitely many partial
quotients $a_k \ge n$, then the inequality

$$\Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{A_nq^2}$$

has infinitely many rational solutions $p/q$.


4.17 Show that the polynomial

$$\begin{aligned}
f(z) &= 80(z-1)(10z-11)(9z-10) - 1\\
 &= 7200z^3 - 23120z^2 + 24720z - 8801
\end{aligned}$$

has three real positive roots, and find the first seven partial quotients of
the continued fraction of the middle one.


4.18 Let $\alpha = \tan(\pi/5)$. Use the minimal polynomial of $\alpha$ to calculate some
of the partial quotients in the continued fraction of $\alpha$, and hence find a
rational number $p/q$ such that

$$\Bigl|\tan\Bigl(\frac{\pi}{5}\Bigr) - \frac{p}{q}\Bigr| < \frac{1}{50q^2} .$$


4.19 Show that if $k \ge 2$, then

$$e^{1/k} = 1 + \frac{1}{k-1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{3k-1+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{5k-1+}\cdots .$$


4.20 Let $\alpha$ be an irrational number with partial quotients $a_k$; suppose that
there exists a constant $A$ such that $a_k < Ak$ whenever $k \ge 1$. Show that
there is a constant $c$ such that

$$\Bigl|\alpha - \frac{p}{q}\Bigr| > \frac{c}{q^2\log q}$$

for all rational numbers $p/q$ with $q > 1$. Deduce that any such $\alpha$, and
in particular $\alpha = e$, is not approximable to order greater than 2.
4.21 Let $\nu$ be a positive real number. The modified Bessel function of the first
kind of order $\nu$ (compare problem 2.6) is defined by the power series

$$I_\nu(x) = \sum_{k=0}^{\infty} \frac{1}{k!\,\Gamma(\nu+k+1)}\Bigl(\frac{x}{2}\Bigr)^{2k+\nu} ;$$

the factor $\Gamma(\nu+k+1)$ in the denominator is a value of the gamma
function, which has the property

$$\Gamma(x+1) = x\,\Gamma(x)$$

for all $x > 0$. Express the continued fraction

$$1 + \frac{1}{2+}\,\frac{1}{3+}\,\frac{1}{4+}\,\frac{1}{5+}\cdots$$

in terms of modified Bessel functions.


4.22 An alternative derivation of the continued fraction for $e$. Let $\alpha$ be the
real number with continued fraction

$$\alpha = 2 + \frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{4+}\,\frac{1}{1+}\,\frac{1}{1+}\,\frac{1}{6+}\cdots ;$$

write $a_k$ for its partial quotients and $p_k/q_k$ for its convergents, and for
any $k$ set $r_k = p_k - q_ke$. Define

$$I_k = \frac{1}{k!}\int_0^1 x^{k+1}(x-1)^k e^x\,dx , \qquad
J_k = \frac{1}{k!}\int_0^1 x^k(x-1)^{k+1} e^x\,dx , \qquad
K_k = \frac{1}{(k+1)!}\int_0^1 x^{k+1}(x-1)^{k+1} e^x\,dx .$$

Show that

$$I_k = r_{3k-1} , \qquad J_k = r_{3k} , \qquad K_k = -r_{3k+1}$$

for $k \ge 0$, and that $I_k, J_k, K_k \to 0$ as $k \to \infty$; deduce that $\alpha = e$.


4.23 Using continued fractions to break an RSA code. In RSA encryption, a
modulus $n$ and an exponent $e$ are made available publicly. A message
$m$ is encoded as $c \equiv m^e \pmod{n}$. The modulus is the product of two
large primes, $n = pq$, where $p$ and $q$ are kept secret. Those who know
$p$ and $q$ can calculate $\phi(n) = (p-1)(q-1)$ and decode the message
by calculating $m \equiv c^d \pmod{n}$: here $d$ is the inverse of $e$ modulo $\phi(n)$,
that is, $de \equiv 1 \pmod{\phi(n)}$. It is commonly asserted that the system
is safe because finding $d$ is “essentially” equivalent to factorising $n$, a
computationally difficult task. However, care must be taken with the
choice of $d$.


(a) Prove that if $e < (p-1)(q-1)$ and $p < q < 2p$ and $(3d)^4 < n$,
then $d$ is the denominator of one of the convergents to $e/n$.

(b) Suppose that the conditions in (a) hold. Explain why there are
only, more or less, $\log n$ candidates for $d$, and why this makes the
encryption insecure.


(c) Implement these ideas in the following small-scale example. Suppose
that the public parameters of the code are

$$n = 376146669038857 \quad\text{and}\quad e = 7654913878769$$

(though in a real-life situation, $n$ would have 200 digits or more).
If $p$ and $q$ satisfy $p < q < 2p$, and if $d < 1467$, break the code by
determining $d$.


4.24 Consider an $a \times b$ rectangle, with $a < b$. By reducing such a rectangle
we mean cutting off as many as possible $a \times a$ squares, beginning at the
side of length $a$. Suppose we begin with a rectangle of size $1 \times \alpha$, with
$\alpha > 1$; reduce it (we call this the 0th step) to obtain a smaller rectangle;
reduce this (the 1st step) to obtain a smaller rectangle again; and so on.


(a) Determine how many squares are cut off at the $k$th step.
(b) Find the size of the rectangle remaining after the $k$th step.


(c) If $\alpha = 1.2345$, draw an accurate scale diagram of the process up to
the fourth step.


4.25 “We are going well,” said [Sherlock Holmes], looking out the window
and glancing at his watch. “Our rate at present is fifty-three and a
half miles an hour.” “I have not observed the quarter-mile posts,” said
[Watson]. “Nor have I. But the telegraph posts upon this line are sixty
yards apart, and the calculation is a simple one.” (Sir Arthur Conan
Doyle, The Adventure of Silver Blaze.)


What, exactly, do you think was the calculation performed by Holmes?
Give reasons for your opinion.


4.26 An investment company advertises (Sydney Morning Herald, 7 September
2002) “the potential to [earn] over 20%”. A footnote explains that
“20% or more was achieved in 29.41% of simulated tests”. Can you find
any reason to doubt the integrity of this company?


APPENDIX 1: A PROPERTY OF POSITIVE FRACTIONS 


A simple property of positive fractions. If $a, b, c$ and $d$ are positive and
$a/b$ is not equal to $c/d$, then

$$\frac{a+c}{b+d}$$

lies (strictly) between $a/b$ and $c/d$.
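
A small illustration of this property in exact arithmetic; the function name `mediant` is the customary term for $(a+c)/(b+d)$ but is introduced here, not in the text above.

```python
from fractions import Fraction


def mediant(a, b, c, d):
    """The fraction (a + c)/(b + d) formed from a/b and c/d."""
    return Fraction(a + c, b + d)


for a, b, c, d in [(1, 2, 2, 3), (3, 1, 22, 7), (5, 8, 3, 5)]:
    low, high = sorted([Fraction(a, b), Fraction(c, d)])
    assert low < mediant(a, b, c, d) < high

print(mediant(3, 1, 22, 7))   # 25/8, which lies between 3 and 22/7
```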
APPENDIX 2: SIMULTANEOUS EQUATIONS WITH INTEGRAL
COEFFICIENTS


Let $a, b, c, d$ and $p, q$ be integers. If $ad - bc = \pm 1$, then the simultaneous
equations

$$\begin{cases} ax + by = p\\ cx + dy = q \end{cases}$$

have an integral solution $x, y$. Conversely, if the system has an integral solution
for all integers $p, q$, then $ad - bc = \pm 1$.


Proof. The solution can be written

$$\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} a & b\\ c & d \end{pmatrix}^{-1}\begin{pmatrix} p\\ q \end{pmatrix}
 = \frac{1}{ad-bc}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}\begin{pmatrix} p\\ q \end{pmatrix}
 = \frac{1}{ad-bc}\begin{pmatrix} dp - bq\\ aq - cp \end{pmatrix}$$

provided that $ad - bc \ne 0$. It is clear that if $ad - bc = \pm 1$, then $x$ and $y$ are
integers. Conversely, suppose that $|ad - bc| > 1$ and consider the solutions
when $p = 1$, $q = 0$ and when $p = 0$, $q = 1$. If these solutions are to be integers,
then $ad - bc$ must be a factor of $a, b, c$ and $d$. But this leads to

$$(ad - bc)^2 \mid ad - bc ,$$

which is impossible. Finally note that if $ad - bc = 0$, then there exist $p, q$ for
which the system has no solution at all, and therefore certainly no integral
solution.
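
The formula in the proof translates directly into integer arithmetic. The sketch below is an illustration only; the function name is hypothetical, and it relies on the observation that dividing by a determinant equal to $\pm 1$ is the same as multiplying by it.

```python
def solve_integer_system(a, b, c, d, p, q):
    """Integral solution (x, y) of ax + by = p, cx + dy = q when ad - bc = +1 or -1."""
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("an integral solution is guaranteed only when ad - bc = +1 or -1")
    # 1/det equals det when det is +1 or -1, so no division is needed.
    x = (d * p - b * q) * det
    y = (a * q - c * p) * det
    return x, y


x, y = solve_integer_system(2, 3, 1, 1, 7, 3)     # determinant -1
print(x, y, 2 * x + 3 * y, x + y)                  # 2 1 7 3
```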


Exercise. Let $A$ be an $n \times n$ matrix with integral entries. Show that the linear
equations $Ax = b$ have a solution $x$ with integral components for all integer
vectors $b$, if and only if $\det(A) = \pm 1$.


APPENDIX 3: CARDINALITY OF SETS OF SEQUENCES 


Theorem 4.23. Let $A$ be a set with more than one element. Then the set

$$S = \{\,(a_0, a_1, a_2, \ldots) \mid a_k \in A \text{ for all } k\,\}$$

of sequences in $A$ is uncountable.


Proof. Suppose that $S$ is countable; then by definition (see Chapter 3, appendix 1)
there is a one-to-one function $f$ from $S$ to $\mathbb{N}$. Let $x$ and $y$ be distinct
elements of $A$. Since $f$ is one-to-one each $k \in \mathbb{N}$ is the image of at most one
sequence in $S$, and so there is a well-defined sequence $(b_0, b_1, b_2, \ldots)$ given by

$$b_k = \begin{cases} x & \text{if } k = f(a_0, a_1, a_2, \ldots) \text{ and } a_k = y\\ y & \text{otherwise;} \end{cases}$$

note that the “otherwise” includes both the case where $a_k$ is an element
of $A$ other than $y$ and the case where $k$ is not in the range of $f$. Clearly
$(b_0, b_1, b_2, \ldots)$ is in $S$, and so $f(b_0, b_1, b_2, \ldots)$ is equal to some natural number
$k$. But this is impossible since from the definition we have $b_k = x$ if $b_k = y$,
and $b_k = y$ if $b_k \ne y$. We have a contradiction, and the result is proved.


APPENDIX 4: BASIC MUSICAL TERMINOLOGY 


Some musical terminology, for readers who may not already be familiar with
it. Consider a piano keyboard, part of which is shown in figure 4.3. The white
keys are labelled with the first seven letters of the alphabet: A, B, C, D, E,
F, G. After using all of these we start again. Note that A follows G, and that
C appears at both the left and right-hand ends of the diagram. A black key
is given the name of the white key just below it, with a sharp (♯) added; or of
the white key just above, with a flat (♭) added. Thus the leftmost black key
in the diagram is called C♯ or D♭, pronounced “C sharp”, “D flat”.


Figure 4.3 Part of a piano keyboard. White keys, left to right: C, D, E, F, G,
A, B, C. Black keys, left to right: C♯/D♭, D♯/E♭, F♯/G♭, G♯/A♭, A♯/B♭.


The interval from any key to the next is called a semitone. This is the
smallest interval used in the majority of traditional Western music. For example,
C to C♯, G♯ to A, and B to C are all semitones. The interval from any key to the
next key of the same name (for example C to C, E♭ to E♭) is called an octave.
Counting five steps up a scale (including the first and last notes) gives the
interval of a fifth, sometimes, for emphasis, called a perfect fifth. (Musical
readers will know that there are other kinds of fifths, but we shall not be
concerned with them here.) Instances of perfect fifths are C to G, B to F♯ and
G♭ to D♭.


Musical sounds are caused by regular vibrations in the air (or in other
media). The number of vibrations per second causing any particular note is
the frequency of that note in units of Hertz (Hz). For example, the modern
standard of orchestral pitch is established by defining the note A above middle
C to have a frequency of exactly 440 Hz, that is, 440 vibrations per second.
It is found by observation (and backed up by psychological and physiological
theories) that when two notes of different pitches are sounded simultaneously
or consecutively, the result is most pleasing to the ear if the ratio of the
frequencies of the pitches is a simple fraction. The simplest possible fractions
are $\frac21$ and $\frac32$, and these correspond to the intervals of the octave and the perfect
fifth respectively. Middle C, for example, has a frequency of 262 Hz; a perfect
fifth above is G with a frequency of 393 Hz; the octave above middle C is the
C with frequency 524 Hz.


The above is adapted, with permission, from the present author’s arti- 
cle [3]. For accessible reading on acoustical aspects of music, the classic text 
is that by Sir James Jeans [34]. Very much more detailed information may be 
found in [12]. 


CHAPTER 5 


Hermite’s Method for 
Transcendence 


Be it enacted by the General Assembly of the State of Indiana: 
It has been found that a circular area is to the square on 
a line equal to the quadrant of the circumference, as the 
area of an equilateral rectangle is to the square on one side... 
The present rule... is entirely wrong... 


House Bill No. 246 (1897), State of Indiana 


Now I, even I, would celebrate 

In rhymes unapt, the great 

Immortal Syracusan, rivaled nevermore, 
Who in his wondrous lore 

Passed on before 

Left men his guidance 

How to circles mensurate. 


Adam C. Orr [48] 


AS POINTED OUT IN CHAPTER 2, it is often easy to prove the irrationality
of a number specifically constructed so as to be irrational, but can be
much harder to prove a given “naturally occurring” number irrational. The
same remarks apply with still more force to the question of transcendence: we
have already shown that the number

$$\sum_{k=1}^{\infty} \frac{1}{10^{k!}} = 0.110001000000000000000001000\cdots$$

is transcendental, but this is hardly a number which one would expect to
encounter in any other area of mathematics. In the present chapter we shall
demonstrate the transcendence of two of the most important constants of
mathematics, $e$ and $\pi$, and shall use the same techniques, but in a more
complex way, to prove an important theorem of Lindemann which generalises
both of these results. We shall also develop some properties of symmetric
polynomials, which will be required in proving the transcendence of $\pi$.


¹ In 1897 the parliament of the US state of Indiana was presented with a bill which
would, in effect, have fixed by legislation the value of $\pi$. If taken literally, the section quoted
is equivalent to the formula $A = (\frac14 c)^2$ for the area of a circle, which implies that $\pi = 4$.
The bill was passed unanimously by the Lower House and sent to the Senate, where owing
to the fortunate intervention of a professor of mathematics, its further consideration was
postponed indefinitely. See Petr Beckmann, A History of $\pi$ [11], Chapter 17.


5.1. TRANSCENDENCE OF e 


The proof of the transcendence of $e$ is in fact not very different from Hermite’s
proof of the irrationality of $e^r$, though naturally the details are more complicated.
As in Chapter 2 we’ll try to provide some motivation before giving a
formal argument. One might expect the proof to be by contradiction, and so
we begin by assuming that $e$ is algebraic: thus, there is a polynomial identity

$$a_me^m + a_{m-1}e^{m-1} + \cdots + a_1e + a_0 = 0 , \tag{5.1}$$

where the coefficients $a_k$ are integers. Our arguments in Chapter 2 were based
upon the integral formula

$$\int_0^r f(x)e^x\,dx = F(r)e^r - F(0)e^0$$

for a certain function $F$. If we try to employ the same expression in a transcendence
proof, we find that the product $F(r)e^r$ of two “variable” or “unknown”
quantities causes difficulties. In Chapter 2, the assumption $e^r = p/q$ simplified
the formula to such an extent that we were able to complete the proof;
for future work, however, it will be advantageous to have an expression of the
form $F(r)e^0 - F(0)e^r$ in which we can deal with the two difficulties separately.
To interchange the roles of 0 and $r$ in the exponential we need only integrate
a function of the form $f(x)e^{r-x}$ instead of $f(x)e^x$.


We should also give some thought to the range of integration. Integrating
from 0 to $r$ was successful in Chapter 2 since we had information concerning $e^0$
(of course!) and $e^r$ (by assumption). In the present case our assumption (5.1)
involves $e^0, e^1, e^2, \ldots, e^m$, and so we shall consider the $m+1$ integrals

$$I_k = \int_0^k f(x)e^{k-x}\,dx ,$$

where $k = 0, 1, 2, \ldots, m$ and $f$ is a polynomial to be chosen later.


Comment. Assuming integrability over a suitable interval, define the convolution
of functions $f$ and $g$ to be the function given by

$$(f * g)(x) = \int_0^x f(t)\,g(x-t)\,dt .$$

Then (exercise!) the operation $*$ is associative, $(f * g) * h = f * (g * h)$, while
the operation defined by

$$(f \odot g)(x) = \int_0^x f(t)\,g(t)\,dt$$

is not; this fact alone suggests that, despite appearances, $*$ is a more “mathematically
natural” way of combining multiplication and integration than is $\odot$.
If we denote the exponential function by exp, then the integrals we are now
considering are certain values of $f * \exp$, while those we used in Chapter 2
were related to the less important $f \odot \exp$.

The convolution of two functions appears in the study of Laplace transforms;
a slightly different type of convolution has connections with Fourier
transforms. The Dirichlet product

$$(f * g)(n) = \sum_{d \mid n} f(d)\,g\Bigl(\frac{n}{d}\Bigr)$$

of two arithmetic functions $f$ and $g$ is very important in number theory, and
may be seen as a discrete analogue of the convolution.


We return to the development of a transcendence proof for $e$. Integrating
by parts, we obtain

$$I_k = F(0)e^k - F(k) ,$$

where

$$F(x) = f(x) + f'(x) + f''(x) + \cdots .$$
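
The identity $I_k = F(0)e^k - F(k)$ can be tested numerically for a sample polynomial. In the sketch below the integral is approximated by Simpson's rule; the test polynomial, the number of subintervals and the tolerance are all choices made for this illustration.

```python
import math


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_derivative(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_eval(p, x):
    value = 0
    for c in reversed(p):
        value = value * x + c
    return value


def F_value(f, x):
    """F(x) = f(x) + f'(x) + f''(x) + ..., a finite sum for a polynomial."""
    total, g = 0, list(f)
    while any(g):
        total += poly_eval(g, x)
        g = poly_derivative(g)
    return total


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


# sample polynomial f(x) = x^2 (x - 1)^3 (x - 2)^3
f = [0, 0, 1]
for root in (1, 2):
    for _ in range(3):
        f = poly_mul(f, [-root, 1])

for k in (1, 2):
    numeric = simpson(lambda x: poly_eval(f, x) * math.exp(k - x), 0, k)
    exact = F_value(f, 0) * math.exp(k) - F_value(f, k)
    assert abs(numeric - exact) < 1e-6
print("I_k agrees with F(0) e^k - F(k) for k = 1, 2")
```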


Our transcendence proof will rely on the expression

$$J = \sum_{k=0}^{m} a_kI_k = -\sum_{k=0}^{m} a_kF(k) .$$

We shall aim to show by estimating the integrals $I_k$ that $J$ is “small”, and
by analysing the derivatives of $f$ that $J$ is a non-zero integer and therefore
“large”; this will give the sort of contradiction that we have seen a number of
times in Chapter 2, and will prove that $e$ is not algebraic.


To make $F(k)$ simple for $k = 0, 1, 2, \ldots, m$ we shall choose $f(x)$ to have
many factors of $x - k$ for each $k$. Recall that (perhaps surprisingly) one of the
more intricate aspects of the proofs in Chapter 2 was the necessity of showing
an integral expression such as $J$ to be non-zero. Our earliest proofs relied on
the fact that the integrand was positive; but here we know essentially nothing
about the coefficients $a_k$, and so this method appears unlikely to succeed.
We take inspiration, instead, from the argument employed in the proof of
Theorem 2.5 (page 25), where we proved that a certain expression was non-zero
because it was not a multiple of $(n+1)!$. This worked because $f$ was
divisible by a high power of $x$, and an even higher power of $x - r$. So we shall
set

$$f(x) = x^n(x-1)^{n+1}(x-2)^{n+1}\cdots(x-m)^{n+1}$$

where, as usual, $n$ is to be chosen later.


The above ideas will suffice to prove e transcendental. We shall need the 
lemma on derivatives of polynomials from Chapter 2; for convenience of ref- 
erence we restate (a particular case of) this lemma. 


Lemma 5.1. Derivatives of polynomials. Let $a$ be an integer, $n$ a non-negative
integer, and $g$ a polynomial with integral coefficients. Define the polynomial
$f$ by

$$f(x) = (x-a)^n g(x) .$$

Then for all $j \ge 0$, the derivative $f^{(j)}(a)$ is an integer divisible by $n!$.
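
The lemma can be illustrated with exact integer arithmetic on coefficient lists. The sample values $a = 3$, $n = 4$ and $g(x) = 7x^2 - 2x + 5$ are arbitrary choices made for this sketch.

```python
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_derivative(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_eval(p, x):
    value = 0
    for c in reversed(p):
        value = value * x + c
    return value


def derivative_values(a, n, g, max_order):
    """Values f(a), f'(a), ..., f^(max_order)(a) for f(x) = (x - a)^n g(x)."""
    f = list(g)
    for _ in range(n):
        f = poly_mul(f, [-a, 1])
    values = []
    for _ in range(max_order + 1):
        values.append(poly_eval(f, a))
        f = poly_derivative(f)
    return values


values = derivative_values(3, 4, [5, -2, 7], 8)
assert all(v % factorial(4) == 0 for v in values)
print(values[:5])   # the first four vanish; the fifth is 4! * g(3) = 1488
```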


Theorem 5.2. (Hermite, 1873). The exponential constant e is transcendental. 


Proof. Suppose that $e$ is an algebraic number of degree $m$, and therefore
satisfies an algebraic equation
\[ a_m e^m + a_{m-1} e^{m-1} + \cdots + a_1 e + a_0 = 0 . \tag{5.2} \]
Without loss of generality we may assume that the coefficients $a_k$ are rational
integers with $a_0 \ne 0$. For any positive integer $n$ let
\[ f(x) = x^n (x-1)^{n+1} (x-2)^{n+1} \cdots (x-m)^{n+1} , \]
and define
\[ I_k = \int_0^k f(x)\, e^{k-x} \, dx \]
for $k = 0,1,2,\ldots,m$. Integrating repeatedly by parts (exercise!) we obtain
\[ I_k = F(0)e^k - F(k) , \tag{5.3} \]


where
\[ F(x) = f(x) + f'(x) + f''(x) + \cdots . \]
For $k = 1,2,\ldots,m$ the lemma shows that $(n+1)! \mid f^{(j)}(k)$ for all $j \ge 0$, and
hence that $(n+1)!$ is a factor of $F(k)$. To handle the case $k = 0$ we note that
\[ f(x) = (-1)^{m(n+1)} (m!)^{n+1} x^n + \{\text{higher order terms}\} \]
and so
\[ f^{(j)}(0) = j! \times \{\text{coefficient of } x^j \text{ in } f(x)\} =
\begin{cases}
0 & \text{if } j < n \\
(-1)^{m(n+1)} (m!)^{n+1}\, n! & \text{if } j = n \\
\text{a multiple of } (n+1)! & \text{if } j > n .
\end{cases} \]
Now set
\[ J = a_0 I_0 + a_1 I_1 + a_2 I_2 + \cdots + a_m I_m . \]
Using equations (5.2) and (5.3), we have
\[ J = \sum_{k=0}^{m} a_k \bigl( F(0)e^k - F(k) \bigr) = -\sum_{k=0}^{m} a_k F(k) , \]
and the divisibility properties we have just proved show that
\[ J = (-1)^{mn+m+1} (m!)^{n+1}\, n!\, a_0 + \{\text{a multiple of } (n+1)!\} . \tag{5.4} \]


On the other hand, we can find an upper bound for $J$ by using the usual sort
of integral estimate. First, observe that the range of integration for every $I_k$
is a subset of the interval $0 \le x \le m$. Thus for all relevant $x$ the polynomial
$f(x)$ is a product of $mn+m+n$ factors, each of absolute value at most $m$, so
\[ |I_k| \le \int_0^k |f(x)|\, e^{k-x} \, dx \le k\, m^{mn+m+n} e^k \le m^{mn+m+n+1} e^m \]
and hence
\[ |J| \le \sum_{k=0}^{m} |a_k I_k| \le \Bigl( \sum_{k=0}^{m} |a_k| \Bigr) m^{mn+m+n+1} e^m . \]


Since we have assumed $e$ to be algebraic, its degree $m$ and the coefficients of its
minimal polynomial are fixed numbers, independent of $n$, and this inequality
can be written
\[ |J| \le a b^n \tag{5.5} \]
with $a$ and $b$ independent of $n$. To complete the proof, choose $n$ such that
$n+1$ is a prime number greater than both $m$ and $|a_0|$, and large enough
that $n! > ab^n$. Then (5.4) shows that $J$ is an integer which is divisible by $n!$
but not by $(n+1)!$; thus $|J| \ge n! > ab^n$, which contradicts (5.5). Therefore,
the assumption that $e$ is algebraic is untenable, and we have shown that $e$ is
transcendental.
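The divisibility statement (5.4) concerns the integer $-\sum_k a_k F(k)$, which can be computed exactly whatever the $a_k$ are; only its identification with the integral form of $J$ uses the hypothesis (5.2). The sketch below is an illustration under assumptions: the helper names and the sample coefficients are hypothetical, not taken from the text. It checks the congruence modulo $(n+1)!$ for several $n$.

```python
from math import factorial

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]

def poly_eval(p, x):
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def hermite_f(m, n):
    """Coefficients of f(x) = x^n (x-1)^(n+1) ... (x-m)^(n+1)."""
    f = [0] * n + [1]
    for k in range(1, m + 1):
        for _ in range(n + 1):
            f = poly_mul(f, [-k, 1])
    return f

def big_F(f, x):
    """F(x) = f(x) + f'(x) + f''(x) + ... (a finite sum for a polynomial)."""
    total = 0
    while any(f):
        total += poly_eval(f, x)
        f = poly_deriv(f)
    return total

def check_5_4(coeffs, n):
    """coeffs = [a_0, ..., a_m]; test (5.4) modulo (n+1)!."""
    m = len(coeffs) - 1
    f = hermite_f(m, n)
    J = -sum(a * big_F(f, k) for k, a in enumerate(coeffs))
    lead = (-1) ** (m * n + m + 1) * factorial(m) ** (n + 1) * factorial(n) * coeffs[0]
    return (J - lead) % factorial(n + 1) == 0

# Assumed sample coefficients a_0, ..., a_4 (any integers would do).
ok = all(check_5_4([3, -1, 4, 1, -5], n) for n in range(1, 7))
```

When $n+1$ is a prime exceeding $m$ and $|a_0|$, the leading term is not divisible by $n+1$, which is what forces $J \ne 0$ in the proof.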


Corollary 5.3. If $r$ is a non-zero rational number, then $e^r$ is transcendental.

Proof. Let $\beta = e^r = e^{p/q}$. If $\beta$ is algebraic, then $e$ is a root of the polynomial
equation
\[ z^p - \beta^q = 0 \]
with algebraic coefficients; hence, by Theorem 3.12, the number $e$ is algebraic.
But we have just shown that this is not so.

Comment. Another interesting question is whether or not $e^\alpha$ need be
transcendental for an algebraic number $\alpha \ne 0$. We shall return to this later.


5.2. TRANSCENDENCE OF $\pi$


We now turn to proving the transcendence of $\pi$. Some features of the proof will
be very similar to the one we have just done: the underlying reason for this is
that $\pi$ is closely connected with the exponential function by virtue of Euler’s
formula $e^{i\pi} = -1$. To take advantage of this connection, however, we shall
have to consider not only real but also complex algebraic numbers. The main
additional difficulty we shall encounter will be in constructing the polynomial
$f$ and hence the integrals $I_k$. We shall assume that $\pi$ is algebraic, and use this
assumption to construct a polynomial with known roots; but we shall need
to prove that this polynomial has rational coefficients. To do so we shall need
certain facts about symmetric polynomials; these facts are connected with the
well-known relations between roots and coefficients of a polynomial.


5.2.1 Symmetric polynomials 


Definition 5.1. A symmetric polynomial $f$ in $m$ variables is a polynomial
with the property that if $(y_1, y_2, \ldots, y_m)$ is a permutation of $(x_1, x_2, \ldots, x_m)$,
then
\[ f(y_1, y_2, \ldots, y_m) = f(x_1, x_2, \ldots, x_m) . \]

The elementary symmetric polynomials in $m$ variables are the polynomials
$e_k$ for $k = 0,1,2,\ldots,m$, where $e_k(x_1, x_2, \ldots, x_m)$ is the sum of all products
of $k$ distinct variables from $\{ x_1, x_2, \ldots, x_m \}$.


Examples. The elementary symmetric polynomials in $x_1, x_2, x_3, x_4$ are
\[ e_0 = 1 , \qquad e_1 = x_1 + x_2 + x_3 + x_4 , \]
\[ e_2 = x_1x_2 + x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4 , \]
\[ e_3 = x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4 + x_2x_3x_4 , \qquad e_4 = x_1x_2x_3x_4 . \]


Polynomials such as 


\[ f(x_1, x_2, x_3, x_4) = x_1^3 + x_2^3 + x_3^3 + x_4^3 - 7x_1 - 7x_2 - 7x_3 - 7x_4 \]


and 
\[ \begin{aligned}
f(x_1, x_2, x_3, x_4) ={}& x_1^2x_2^2x_3 + x_1^2x_2^2x_4 + x_1^2x_2x_3^2 + x_1^2x_3^2x_4 \\
&+ x_1^2x_2x_4^2 + x_1^2x_3x_4^2 + x_1x_2^2x_3^2 + x_2^2x_3^2x_4 \\
&+ x_1x_2^2x_4^2 + x_2^2x_3x_4^2 + x_1x_3^2x_4^2 + x_2x_3^2x_4^2
\end{aligned} \tag{5.6} \]


are not elementary symmetric polynomials, but they are symmetric because 
any reordering of the variables will return the same expression. The polynomial 


\[ f(x_1, x_2, x_3, x_4) = x_1^2x_2^2x_3 + x_2^2x_3^2x_4 + x_3^2x_4^2x_1 + x_4^2x_1^2x_2 \]

is not symmetric since $f(x_1, x_2, x_3, x_4) \ne f(x_2, x_1, x_3, x_4)$.

The elementary symmetric polynomials, as suggested above, form the basis
of the well-known relations between roots and coefficients of a polynomial. 


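Lemma 5.4. Let $e_0, e_1, e_2, \ldots, e_m$ be the elementary symmetric polynomials
in $\alpha_1, \alpha_2, \ldots, \alpha_m$. Then
\[ (1 + \alpha_1)(1 + \alpha_2) \cdots (1 + \alpha_m) = e_0 + e_1 + e_2 + \cdots + e_m , \]
and for any $x$, we have
\[ (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_m) = x^m - e_1 x^{m-1} + e_2 x^{m-2} - \cdots + (-1)^m e_m . \]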


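Proof. For the first identity, just stare at it until it becomes obvious! Then
replace $\alpha_k$ by $-\alpha_k/x$, observe that
\[ e_k\Bigl( -\frac{\alpha_1}{x}, -\frac{\alpha_2}{x}, \ldots, -\frac{\alpha_m}{x} \Bigr) = (-1)^k x^{-k} e_k(\alpha_1, \alpha_2, \ldots, \alpha_m) , \]
and multiply through by $x^m$; this proves the second result.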
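Both identities of Lemma 5.4 can be confirmed on concrete integers. The following sketch is illustrative only; the function names and the sample values of $\alpha_k$ are assumptions made for the example.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_m] for the given list of numbers."""
    m = len(values)
    return [sum(prod(c) for c in combinations(values, k)) for k in range(m + 1)]

def expand_monic(roots):
    """Coefficients of (x - r_1)...(x - r_m), highest degree first."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

alphas = [2, -3, 5, 7]                      # assumed sample values
e = elementary_symmetric(alphas)

# First identity: (1 + a_1)...(1 + a_m) = e_0 + e_1 + ... + e_m.
first_identity = prod(1 + a for a in alphas) == sum(e)

# Second identity: coefficients of the monic polynomial are (-1)^k e_k.
second_identity = expand_monic(alphas) == [(-1) ** k * e[k] for k in range(len(e))]
```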


One reason for the importance of the elementary symmetric polynomials 
is that they can be used to write an expression for any symmetric polynomial. 


Theorem 5.5. Symmetric polynomials. Let $f$ be a symmetric polynomial in
$x_1, x_2, \ldots, x_m$, with coefficients in $R$, an additive subgroup of the real numbers.
Let $e_k$ be the elementary symmetric polynomials in $x_1, x_2, \ldots, x_m$. Then there
is a polynomial $g$ in $m$ variables, having coefficients in $R$ and degree at most
that of $f$, such that
\[ f(x_1, x_2, \ldots, x_m) = g(e_1, e_2, \ldots, e_m) . \]


Comment. More tersely stated: any symmetric polynomial is a polynomial 
of the same or smaller degree in the elementary symmetric polynomials; and 
if the former polynomial has coefficients in R, then so does the latter. 
Proof by induction on $n$, the degree of $f$, and $m$, the number of variables.
The result is trivial if $m = 0$ or $m = 1$; let $f$ be a symmetric polynomial of
degree $n$ in $x_1, \ldots, x_m$, having coefficients in $R$, and assume that the result is
true for all polynomials with degree at most $n$ in at most $m$ variables (with
at least one of these inequalities strict). Let
\[ f^*(x_1, \ldots, x_{m-1}) = f(x_1, \ldots, x_{m-1}, 0) , \]
and similarly write
\[ e_k^*(x_1, \ldots, x_{m-1}) = e_k(x_1, \ldots, x_{m-1}, 0) ; \]
it is not hard to see that the $e_k^*$ are in fact the elementary symmetric
polynomials in $x_1, \ldots, x_{m-1}$.


Now since $f$ is symmetric, $f^*$ is also symmetric; it has fewer than $m$
variables and degree at most $n$, and so by the inductive assumption there
is a polynomial $g^*$ with coefficients in $R$, such that $f^* = g^*(e_1^*, \ldots, e_{m-1}^*)$.
Consider the polynomial
\[ f(x_1, \ldots, x_m) - g^*(e_1, \ldots, e_{m-1}) , \]
which is clearly symmetric in $x_1, \ldots, x_m$. This polynomial has $x_m$ as a factor;
since it is symmetric, each other $x_k$ is also a factor, and hence $e_m$ is a factor.
Thus
\[ f(x_1, \ldots, x_m) - g^*(e_1, \ldots, e_{m-1}) = e_m h(x_1, \ldots, x_m) , \]
where $h$ is again a symmetric polynomial. Since $e_k$ and $e_k^*$ are of equal degree
for $k = 1,2,\ldots,m-1$, we have
\[ \deg\bigl(g^*(e_1, \ldots, e_{m-1})\bigr) = \deg\bigl(g^*(e_1^*, \ldots, e_{m-1}^*)\bigr) = \deg f^* \le \deg f , \]
where in the first two terms the degrees are in terms of the variables
$x_1, \ldots, x_m$, and so $h$ has smaller degree than $f$; also, $h$ has coefficients in
$R$. By induction we may assume that $h$ is a polynomial in $e_1, \ldots, e_m$ with
coefficients in $R$, and since $g^*$ also has coefficients in $R$, so does
\[ g(e_1, \ldots, e_m) = g^*(e_1, \ldots, e_{m-1}) + e_m h(x_1, \ldots, x_m) . \]
Finally, every $e_k$ has degree at least 1 in the variables $x_1, \ldots, x_m$; so $g$ cannot
have degree exceeding that of $f$, and this completes the proof.


Comments. 


• In applying the above theorem we shall mainly be interested in three
cases: $R = \mathbb{Q}$, the set of rational numbers; $R = \mathbb{Z}$, the set of integers;
and $R = c\mathbb{Z}$, the set of multiples of a fixed integer $c$.

• The proof provides an algorithm for expressing a symmetric polynomial
in terms of elementary symmetric polynomials. For example, take $f$ to
be the polynomial in four variables given by (5.6) on page 114. Then
\[ f^* = x_1^2x_2^2x_3 + x_1^2x_2x_3^2 + x_1x_2^2x_3^2 , \]
and it is easy to see that $f^* = e_2^* e_3^*$. (In a more difficult case we would
iterate the algorithm to give $f^{**} = 0$, and so by the theorem $f^* = e_3^* h^*$
with $h^* = x_1x_2 + x_1x_3 + x_2x_3$; if it were not clear how to express $h^*$
in terms of elementary symmetric functions we would now apply the
algorithm to $h^*$.) Therefore, the theorem tells us that $e_4$ is a factor of
$f - e_2e_3$; performing the algebra, we find
\[ \begin{aligned}
f - e_2e_3 &= -3x_1^2x_2x_3x_4 - 3x_1x_2^2x_3x_4 - 3x_1x_2x_3^2x_4 - 3x_1x_2x_3x_4^2 \\
&= -3e_4(x_1 + x_2 + x_3 + x_4) \\
&= -3e_1e_4 ,
\end{aligned} \]
and so
\[ f = e_2e_3 - 3e_1e_4 , \]
which, as claimed, is a polynomial in the four variables $e_1, e_2, e_3, e_4$.

• For a different algorithm (and, consequently, a different proof of the
preceding theorem) see Stewart and Tall [62], pages 24-27.
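The worked example can be spot-checked numerically. The sketch below assumes that (5.6) is the sum of all twelve monomials $x_i^2 x_j^2 x_k$ with $i, j, k$ distinct (the reading consistent with the $f^*$ computed above), and compares $f$ with $e_2e_3 - 3e_1e_4$ at random integer points. Agreement at sample points is supporting evidence rather than a proof; the function names are illustrative.

```python
from itertools import combinations
from math import prod
import random

def e(k, xs):
    """k-th elementary symmetric polynomial evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, k))

def f_5_6(xs):
    """Sum of x_i^2 x_j^2 x_k over unordered pairs {i, j} and k distinct from both."""
    total = 0
    for i, j in combinations(range(len(xs)), 2):
        for k in range(len(xs)):
            if k not in (i, j):
                total += xs[i] ** 2 * xs[j] ** 2 * xs[k]
    return total

random.seed(0)
samples = [[random.randint(-9, 9) for _ in range(4)] for _ in range(200)]
identity_holds = all(
    f_5_6(xs) == e(2, xs) * e(3, xs) - 3 * e(1, xs) * e(4, xs) for xs in samples
)
```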


Corollary 5.6. Evaluations of symmetric polynomials. 


• Let $f$ be a monic polynomial of degree $m$ with integer coefficients. Suppose
that $f$ has roots $\alpha_1, \alpha_2, \ldots, \alpha_m$ (including repeated roots, if any)
and let $c$ be a symmetric polynomial in $m$ variables with coefficients in
$R$, an additive subgroup of the real numbers. Then $c(\alpha_1, \alpha_2, \ldots, \alpha_m)$ is
an element of $R$.

• If $R$ is $\mathbb{Q}$, or indeed any subfield of the real numbers, then the above holds
for all polynomials $f$, monic or not.


Proof. To prove the first statement we note that from the preceding theorem,
$c(\alpha_1, \alpha_2, \ldots, \alpha_m)$ is a polynomial in the elementary symmetric polynomials
of $\alpha_1, \alpha_2, \ldots, \alpha_m$, having coefficients in $R$; but these elementary symmetric
polynomials are (up to sign) the coefficients of $f$, and hence are integers.
Finally, a polynomial with coefficients in $R$, evaluated at integer arguments,
is a sum of integer multiples of elements of $R$, and hence belongs to the same
subgroup.

The argument for the second statement is almost identical, only noting
that in this case the elementary symmetric polynomials of $\alpha_1, \alpha_2, \ldots, \alpha_m$ are
the coefficients of $f$, divided by its leading coefficient, and hence are rational;
and any rational multiple of an element of the field $R$ is also in $R$.
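As a concrete instance of the corollary, the power sums $\alpha_1^k + \cdots + \alpha_m^k$ of the roots of a monic integer polynomial are integers, even when the roots themselves are irrational. The sketch below computes them from the elementary symmetric polynomials by Newton's identities; that technique is brought in here for illustration and is not the method of the text. The sample polynomial $z^3 - 3z^2 + 5$ has $e_1 = 3$, $e_2 = 0$, $e_3 = -5$.

```python
def power_sums(e, count):
    """Newton's identities: e = [e_1, ..., e_m]; return [p_1, ..., p_count],
    where p_k is the sum of k-th powers of the roots."""
    m = len(e)
    p = []
    for k in range(1, count + 1):
        total = 0
        for i in range(1, k):
            if i <= m:
                total += (-1) ** (i - 1) * e[i - 1] * p[k - i - 1]
        if k <= m:
            total += (-1) ** (k - 1) * k * e[k - 1]
        p.append(total)
    return p

# Roots of z^3 - 3z^2 + 5: e_1 = 3, e_2 = 0, e_3 = -5.
sums = power_sums([3, 0, -5], 4)   # [3, 9, 12, 21]
```

The values agree with the recurrence $\alpha^3 = 3\alpha^2 - 5$ summed over the roots: $p_3 = 3p_2 - 15 = 12$ and $p_4 = 3p_3 - 5p_1 = 21$.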


5.2.2 The transcendence proof 


The only further preparation we need before embarking on the transcendence
proof for $\pi$ is to recall that the conjugates of an algebraic number $\alpha$ are the
roots of its minimal polynomial; if $\alpha$ is of degree $m$, then it has $m$ conjugates,
one of which is $\alpha$ itself.


Theorem 5.7. (Lindemann, 1882). $\pi$ is transcendental.


Proof. Suppose that $\pi$ is algebraic; then $i\pi$, being a product of algebraic
numbers, is also algebraic. Let $m$ be the degree of $i\pi$, and let the conjugates
of $i\pi$ be $\alpha_1, \alpha_2, \ldots, \alpha_m$. We have
\[ (e^{\alpha_1} + 1)(e^{\alpha_2} + 1) \cdots (e^{\alpha_m} + 1) = 0 \tag{5.7} \]
because one of the factors is $e^{i\pi} + 1$. Expanding the left-hand side we obtain
a sum of $2^m$ terms $e^{\beta_S}$, where for any $S \subseteq \{1,2,\ldots,m\}$ we set
\[ \beta_S = \sum_{k \in S} \alpha_k . \]
That is, the values of $\beta_S$ include all $\alpha_k$, all sums $\alpha_{k_1} + \alpha_{k_2}$ with $k_1 \ne k_2$, and
in general all sums of any number of distinct $\alpha_k$. Note that this includes the
empty sum $\beta_\emptyset = 0$, which corresponds to the product $1 \times 1 \times \cdots \times 1$; it is
possible that other sums $\beta_S$ are also zero.

Let $g$ be the monic polynomial of degree $2^m$ whose roots are the numbers
$\beta_S$, including any multiplicities. That is,
\[ g(z) = \prod_{S \subseteq \{1,2,\ldots,m\}} (z - \beta_S) . \]
By Lemma 5.4, the coefficients of $g$ are elementary symmetric polynomials
in the sums $\beta_S$, and therefore can be written as polynomials in the
numbers $\alpha_k$. If the $\alpha_k$ are permuted in any way whatsoever, the expansion
of the left-hand side of (5.7) will contain the same terms; therefore the $\beta_S$
will remain the same, though possibly in a different order; and so the
coefficients of $g$ will be unchanged. That is, these coefficients can be written
as symmetric polynomials in $\alpha_1, \alpha_2, \ldots, \alpha_m$. But because all of the $\alpha_k$ are
conjugates, they are the roots of a single polynomial with rational
coefficients, and so, by Corollary 5.6, the polynomial $g$ also has rational coefficients.
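The construction of $g$ can be tried on a small stand-in. In the sketch below the conjugate pair $1 \pm i$ (roots of $z^2 - 2z + 2$) plays the role of the $\alpha_k$; this choice, and the helper names, are assumptions for illustration, since the conjugates of $i\pi$ exist only hypothetically. Python's complex type is exact here because every intermediate value is a small Gaussian integer.

```python
from itertools import combinations

def subset_sums(alphas):
    """All sums beta_S over subsets S of the list alphas (empty sum included)."""
    sums = []
    for r in range(len(alphas) + 1):
        for subset in combinations(alphas, r):
            sums.append(sum(subset))
    return sums

def monic_from_roots(roots):
    """Coefficients of prod (z - root), highest degree first."""
    coeffs = [1]
    for r in roots:
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

alphas = [1 + 1j, 1 - 1j]        # assumed sample conjugates
betas = subset_sums(alphas)       # 0, 1+i, 1-i, 2
g = monic_from_roots(betas)       # z^4 - 4z^3 + 6z^2 - 4z

s = sum(1 for b in betas if b == 0)   # multiplicity of the root 0
h = g[:len(g) - s]                    # g(z) / z^s
```

In this toy case $g$ has integer coefficients even though its roots are not rational, $s = 1$, and dividing out $z^s$ leaves a cubic with non-zero constant term.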


Suppose that $g(z)$ has a factor of $z$ with multiplicity $s$; since $\beta_\emptyset = 0$ is
a root of $g$, we know that $s \ge 1$. Divide $g$ by $z^s$ and multiply by a common
denominator for its coefficients to obtain a polynomial
\[ h(z) = h_t z^t + h_{t-1} z^{t-1} + \cdots + h_1 z + h_0 \]
of degree $t = 2^m - s$, with integral coefficients, of which $h_t$ and $h_0$ are
non-zero. We relabel the non-zero values of $\beta_S$, including any repetitions,
as $\beta_1, \beta_2, \ldots, \beta_t$; these numbers are therefore the roots of $h$. Expanding the
left-hand side of (5.7), we have
\[ e^{\beta_1} + e^{\beta_2} + \cdots + e^{\beta_t} + s = 0 . \tag{5.8} \]


From now on the proof follows closely the transcendence proof for $e$. For
any positive integer $n$, write
\[ f(z) = z^n h(z)^{n+1} , \]
a polynomial with integral coefficients and degree less than $(n+1)(t+1)$, and
let
\[ I_\beta = \int_0^\beta f(z)\, e^{\beta - z} \, dz \]
for any $\beta \in \mathbb{C}$. Note that the integrand of $I_\beta$ is an entire function. Therefore, we
need not specify the path of integration and may take it to be the straight line
from 0 to $\beta$; moreover, we can compute the integral by using an antiderivative
of the integrand. As we have seen many times already,
\[ I_\beta = F(0)e^\beta - F(\beta) , \tag{5.9} \]
where $F(z) = f(z) + f'(z) + f''(z) + \cdots$. Now consider
\[ J = I_{\beta_1} + I_{\beta_2} + \cdots + I_{\beta_t} = \sum_{k=1}^{t} I_{\beta_k} . \]
Using the results (5.9) and (5.8), and the definition of $F$, we have
\[ J = F(0) \sum_{k=1}^{t} e^{\beta_k} - \sum_{k=1}^{t} F(\beta_k) = -sF(0) - \sum_{j=0}^{\infty} \sum_{k=1}^{t} f^{(j)}(\beta_k) . \tag{5.10} \]


(Remember that the sum over $j$ really has only finitely many terms.) Consider
the innermost sum on the right-hand side. Since $f(z)$ contains a factor $z - \beta_k$
with multiplicity $n+1$ (or possibly more, as the $\beta_k$ need not all be different),
we have
\[ f^{(j)}(\beta_k) = 0 \]
for $j < n+1$. Next take $j \ge n+1$; in this case $f^{(j)}(z)$ is a polynomial with
integer coefficients, all divisible by $j!$ and a fortiori by $(n+1)!$. Therefore
\[ \sum_{k=1}^{t} f^{(j)}(\beta_k) \]
is a symmetric polynomial in $\beta_1, \beta_2, \ldots, \beta_t$, the coefficients of the polynomial
being divisible by $(n+1)!$; so, using the theorem on symmetric polynomials in
the case $R = (n+1)!\,\mathbb{Z}$, it can be written as a polynomial with coefficients
divisible by $(n+1)!$ in the elementary symmetric polynomials $e_k(\beta_1, \beta_2, \ldots, \beta_t)$. But
these elementary symmetric polynomials are the rational numbers $\pm h_{t-k}/h_t$:
that is, we can write
\[ \sum_{k=1}^{t} f^{(j)}(\beta_k) = p(e_1, e_2, \ldots, e_t) = p\Bigl( -\frac{h_{t-1}}{h_t}, \frac{h_{t-2}}{h_t}, \ldots, \pm\frac{h_0}{h_t} \Bigr) . \]
Since
\[ \deg p \le \deg f^{(j)} \le (\deg f) - j < (n+1)t , \]
multiplying by $h_t^{(n+1)t}$ will clear all the denominators from the right-hand
side. Therefore
\[ h_t^{(n+1)t} \sum_{k=1}^{t} f^{(j)}(\beta_k) \]
is an integer divisible by $(n+1)!$, and consequently so is
\[ h_t^{(n+1)t} \sum_{j=0}^{\infty} \Bigl( \sum_{k=1}^{t} f^{(j)}(\beta_k) \Bigr) = h_t^{(n+1)t} \sum_{k=1}^{t} F(\beta_k) . \]
The evaluation of $F(0)$ is comparatively straightforward: by standard
arguments, we have
\[ f^{(j)}(0) =
\begin{cases}
0 & \text{if } j < n \\
h_0^{n+1}\, n! & \text{if } j = n \\
\text{a multiple of } (n+1)! & \text{if } j > n .
\end{cases} \]
Combining (5.10) with all the divisibility results we have just proved,
\[ h_t^{(n+1)t} J = -s\, h_t^{(n+1)t} h_0^{n+1}\, n! + \{\text{a multiple of } (n+1)!\} . \tag{5.11} \]


It remains to estimate the integrals $I_{\beta_k}$. Let $H$ be the maximum absolute value
of the coefficients of $h$ (compare page 41). If $z$ is a complex number lying on
the line segment from 0 to $\beta_k$, then
\[ |h(z)| \le \sum_{j=0}^{t} |h_j|\,|z|^j \le H \sum_{j=0}^{t} |\beta_k|^j \le H(1 + |\beta_k|)^t . \]
Moreover, for the same values of $z$, we have
\[ |e^{\beta_k - z}| = e^{\operatorname{Re}(\beta_k - z)} \le e^{|\operatorname{Re} \beta_k|} ; \]
putting all this information together,
\[ |I_{\beta_k}| \le |\beta_k|\,|\beta_k|^n H^{n+1} (1 + |\beta_k|)^{(n+1)t} e^{|\operatorname{Re} \beta_k|} . \]
Now $\beta_1, \beta_2, \ldots, \beta_t$, $H$ and $t$ are all independent of $n$, so we can write
\[ |I_{\beta_k}| \le a b^n , \]
where
\[ a = a(k) = |\beta_k| (1 + |\beta_k|)^t H e^{|\operatorname{Re} \beta_k|} \quad\text{and}\quad b = b(k) = |\beta_k| (1 + |\beta_k|)^t H \]
depend on $k$ but not on $n$. One last estimate: if we let $A$ be the greatest of
the $a(k)$ for $1 \le k \le t$ and $B$ the greatest of the $b(k)$, then
\[ \bigl| h_t^{(n+1)t} J \bigr| \le |h_t|^{(n+1)t} \sum_{k=1}^{t} |I_{\beta_k}| \le t\, |h_t|^{(n+1)t} A B^n = c d^n , \tag{5.12} \]
with $c$ and $d$ independent of $n$, and we have set up the customary
contradiction. Choose $n$ such that $n+1$ is greater than $s$, greater than the absolute
values of both $h_0$ and $h_t$, prime, and sufficiently large that $cd^n < n!$. Then
from (5.11) we find that $h_t^{(n+1)t} J$ is a non-zero integer divisible by $n!$; using
this observation and inequality (5.12), we have
\[ n! \le \bigl| h_t^{(n+1)t} J \bigr| \le c d^n < n! , \]
which is a contradiction. Therefore, we have shown that $\pi$ is transcendental.
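The final choice of $n$ always succeeds because $n!$ eventually outgrows any geometric sequence $cd^n$ and there are infinitely many primes. The sketch below searches for the smallest admissible $n$; the function `choose_n` and the sample constants are hypothetical, since the actual $c$, $d$, $s$, $h_0$, $h_t$ depend on the assumed minimal polynomial.

```python
from math import factorial

def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def choose_n(c, d, lower_bounds):
    """Smallest n with n+1 prime, n+1 > every entry of lower_bounds,
    and n! > c * d**n (c and d are taken to be positive integers)."""
    n = 1
    while True:
        if (is_prime(n + 1)
                and all(n + 1 > b for b in lower_bounds)
                and factorial(n) > c * d ** n):
            return n
        n += 1

# Assumed sample values: c = 10, d = 5, and n + 1 must exceed 3 and 7.
n = choose_n(10, 5, [3, 7])
```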


Corollary 5.8. The problem of squaring the circle is unsolvable. That is, it
is impossible using ruler and compasses to construct two line segments with
lengths in the ratio $\pi$.

Proof. As mentioned in exercise 3.23, it can be proved that segments with
lengths in the ratio $\alpha$ can be constructed only if $\alpha$ is an algebraic number
whose degree is a power of 2. However, $\pi$ is not such a number.


5.3. SOME MORE IRRATIONALITY PROOFS 


Viewing the transcendence proof for $\pi$ from a slightly different angle, we have
assumed that $\alpha = i\pi$ is algebraic, and have obtained a contradiction by
showing that $e^\alpha$ cannot equal the rational number $-1$. We can use similar ideas
to prove that if $\alpha$ is algebraic and non-zero, then $e^\alpha$ cannot equal any
rational number, or indeed any algebraic number: that is, $e^\alpha$ is transcendental.
We shall approach this difficult theorem slowly, beginning by taking a simple
specific example, $\alpha = \sqrt{2}$, and seeking only to prove the irrationality of $e^\alpha$. In
the course of the proof we shall show that our careful estimates for $h(z)$ and
$e^{\beta_k - z}$ were not really necessary, and may be replaced by an argument based
on simple properties of real or complex functions.


Theorem 5.9. $e^{\sqrt{2}}$ is irrational.

Comment. In fact, we have already asked the reader to prove this result (see
exercise 1.22). However, the method used there, while it also suffices to
prove the irrationality of $e^{\sqrt{3}}$, does not appear to generalise any further. The
method we now introduce is much more powerful.


Proof. Suppose, on the contrary, that $e^{\sqrt{2}} = p/q$ is rational. Inspired by (5.7),
we consider not only $\sqrt{2}$ but also its conjugate $-\sqrt{2}$, and begin by noting that
\[ (q e^{\sqrt{2}} - p)(q e^{-\sqrt{2}} - p) = 0 , \]
because the first factor is zero. Expanding and collecting terms,
\[ (p^2 + q^2) - pq\,(e^{\sqrt{2}} + e^{-\sqrt{2}}) = 0 . \]
Though we shall not use it in the present proof, we note that this equation
can be rewritten
\[ e^{\sqrt{2}} + e^{-\sqrt{2}} - \frac{p^2 + q^2}{pq} = 0 , \]
which is strongly analogous to (5.8). As in the earlier proof, we make use
of a polynomial with integral coefficients, having roots $\sqrt{2}$ and $-\sqrt{2}$: clearly
$h(z) = z^2 - 2$ is such a polynomial. So we set
\[ f(z) = z^n h(z)^{n+1} = z^n (z^2 - 2)^{n+1} \]
and then, closely following our previous proof,
\[ I_\beta = \int_0^\beta f(z)\, e^{\beta - z} \, dz = F(0)e^\beta - F(\beta) \]
with $F = f + f' + f'' + \cdots$, and
\[ J = I_{\sqrt{2}} + I_{-\sqrt{2}} . \]
Then we have
\[ \begin{aligned}
pqJ &= pq\,F(0)\,(e^{\sqrt{2}} + e^{-\sqrt{2}}) - pq\bigl(F(\sqrt{2}) + F(-\sqrt{2})\bigr) \\
&= (p^2 + q^2)\,F(0) - pq\bigl(F(\sqrt{2}) + F(-\sqrt{2})\bigr) .
\end{aligned} \]

Now $f(z)$ has factors $z \pm \sqrt{2}$ with multiplicity $n+1$, and so
\[ f^{(j)}(\sqrt{2}) = f^{(j)}(-\sqrt{2}) = 0 \]
for any $j < n+1$. If $j \ge n+1$, then $f^{(j)}(x_1) + f^{(j)}(x_2)$ is a symmetric
polynomial in $x_1$ and $x_2$, having coefficients divisible by $(n+1)!$; so it can be
written as a polynomial with similar coefficients, evaluated at the elementary
symmetric polynomials,
\[ f^{(j)}(x_1) + f^{(j)}(x_2) = P(e_1, e_2) . \]
If we take $x_1 = \sqrt{2}$ and $x_2 = -\sqrt{2}$, then $e_1$ and $e_2$ can be found in terms of
the coefficients of $h(z)$ and we have
\[ f^{(j)}(\sqrt{2}) + f^{(j)}(-\sqrt{2}) = P(0, -2) : \]
this is a multiple of $(n+1)!$, and hence so is
\[ F(\sqrt{2}) + F(-\sqrt{2}) = \sum_{j=0}^{\infty} \bigl( f^{(j)}(\sqrt{2}) + f^{(j)}(-\sqrt{2}) \bigr) . \]

Moreover,
\[ f^{(j)}(0) =
\begin{cases}
0 & \text{if } j < n \\
(-2)^{n+1}\, n! & \text{if } j = n \\
\text{a multiple of } (n+1)! & \text{if } j > n ;
\end{cases} \]
we can use these results to evaluate $F(0)$, and hence to obtain
\[ pqJ = (-2)^{n+1} (p^2 + q^2)\, n! + \{\text{a multiple of } (n+1)!\} . \tag{5.13} \]
To estimate the integrals $I_\beta$, observe that $e^z$ and $h(z)$ are continuous on the
interval $[-\sqrt{2}, \sqrt{2}\,]$ and hence are bounded there, say
\[ |e^z| \le c_1 \quad\text{and}\quad |h(z)| \le c_2 \]
whenever $|z| \le \sqrt{2}$. So for $\beta = \pm\sqrt{2}$, we have
\[ |I_\beta| \le |\beta|\,|\beta|^n c_2^{n+1} c_1 ; \]
therefore
\[ |pqJ| \le 2pq\, c_1 (c_2\sqrt{2})^{n+1} = c_3 c_4^n , \tag{5.14} \]
where $c_3$ and $c_4$ are constants which do not depend on $n$. Now choose $n$ such
that $n+1$ is prime, is greater than 2 and $p^2 + q^2$, and is large enough that
$n! > c_3 c_4^n$. Then, by the customary arguments, (5.13) shows that $pqJ$ is a
non-zero multiple of $n!$, and this contradicts (5.14). Therefore, $e^{\sqrt{2}}$ is irrational.
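The two divisibility facts behind (5.13) can be verified exactly for small $n$, because for a polynomial $G$ with integer coefficients $g_i$ the sum $G(\sqrt{2}) + G(-\sqrt{2})$ equals $2\sum_{i \text{ even}} g_i 2^{i/2}$, an integer. The following sketch uses that observation; the helper names are illustrative assumptions.

```python
from math import factorial

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]

def big_F_coeffs(f):
    """Coefficients of F = f + f' + f'' + ... (lowest degree first)."""
    total = [0] * len(f)
    while any(f):
        for i, c in enumerate(f):
            total[i] += c
        f = poly_deriv(f)
    return total

def check_5_13(n):
    f = [0] * n + [1]                       # z^n
    for _ in range(n + 1):
        f = poly_mul(f, [-2, 0, 1])         # times h(z) = z^2 - 2
    F = big_F_coeffs(f)
    # F(sqrt 2) + F(-sqrt 2): odd powers cancel, even powers double.
    pair_sum = 2 * sum(c * 2 ** (i // 2) for i, c in enumerate(F) if i % 2 == 0)
    N = factorial(n + 1)
    leading_ok = (F[0] - (-2) ** (n + 1) * factorial(n)) % N == 0
    return leading_ok and pair_sum % N == 0

ok = all(check_5_13(n) for n in range(1, 9))
```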


This proof is rather more involved than the transcendence proof for $\pi$,
because of all the $p$s and $q$s. They arise as a consequence of dealing with an
assumed root of the polynomial $qz - p$, whereas previously we were
investigating a root of the much simpler polynomial $z + 1$; we alleviated some of the
complications by considering only one specific example. We shall do the same
in introducing the next difficulty, and ask the reader to contribute by filling
in routine details where indicated.


Theorem 5.10. Exponential of a cubic irrational. If $\alpha$ is a (real or complex)
root of the polynomial $z^3 - 3z^2 + 5$, then $e^\alpha$ is irrational.


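Proof. Let $\alpha$ be as stated; write $\alpha_1, \alpha_2, \alpha_3$ for the conjugates of $\alpha$; suppose
that $e^\alpha = p/q$. We have
\[ \begin{aligned}
0 &= (q e^{\alpha_1} - p)(q e^{\alpha_2} - p)(q e^{\alpha_3} - p) \\
&= -p^3 + p^2 q\,(e^{\beta_1} + e^{\beta_2} + e^{\beta_3}) - p q^2 (e^{\beta_4} + e^{\beta_5} + e^{\beta_6}) + q^3 e^{\beta_7} ,
\end{aligned} \tag{5.15} \]
where
\[ \beta_1 = \alpha_1 , \quad \beta_2 = \alpha_2 , \quad \beta_3 = \alpha_3 , \]
\[ \beta_4 = \alpha_1 + \alpha_2 , \quad \beta_5 = \alpha_2 + \alpha_3 , \quad \beta_6 = \alpha_3 + \alpha_1 , \]
\[ \beta_7 = \alpha_1 + \alpha_2 + \alpha_3 = 3 . \]
Now let
\[ \begin{aligned}
h(z) &= (z - \beta_1)(z - \beta_2) \cdots (z - \beta_7) \\
&= (z - 3)(z^3 - 3z^2 + 5)(z^3 - 6z^2 + 9z - 5) ;
\end{aligned} \tag{5.16} \]
take $f(z) = z^n h(z)^{n+1}$ and
\[ I_\beta = \int_0^\beta f(z)\, e^{\beta - z} \, dz = F(0)\, e^\beta - F(\beta) , \tag{5.17} \]
where $F = f + f' + f'' + \cdots$. To make effective use of (5.15) we must vary $J$
somewhat from our earlier definition by taking
\[ J = p^2 q\,(I_{\beta_1} + I_{\beta_2} + I_{\beta_3}) - p q^2 (I_{\beta_4} + I_{\beta_5} + I_{\beta_6}) + q^3 I_{\beta_7} ; \tag{5.18} \]
using (5.17) we can then show that
\[ \begin{aligned}
J = p^3 F(0) &- p^2 q \bigl( F(\beta_1) + F(\beta_2) + F(\beta_3) \bigr) \\
&+ p q^2 \bigl( F(\beta_4) + F(\beta_5) + F(\beta_6) \bigr) - q^3 F(\beta_7) .
\end{aligned} \tag{5.19} \]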


We wish to use properties of symmetric polynomials to prove that the sum
of all the terms on the right-hand side, except for the first, is a multiple
of $(n+1)!$. Because of the differing coefficients $p^2q$, $pq^2$ and $q^3$, permuting
$\beta_1, \beta_2, \ldots, \beta_7$ does not always leave this expression unchanged, and the
symmetry obtaining in our previous arguments has been damaged; but enough
remains for us to complete the proof. First, we note that if $j < n+1$, then
$f^{(j)}(\beta_k) = 0$ for every $k$. Now let $j \ge n+1$. Then
\[ f^{(j)}(x_1) + f^{(j)}(x_2) + f^{(j)}(x_3) \tag{5.20} \]
is a symmetric polynomial in $x_1, x_2, x_3$ whose coefficients are multiples of $j!$,
and hence of $(n+1)!$; therefore it can be written as a polynomial $P$, whose
coefficients are also multiples of $(n+1)!$, in the elementary symmetric functions
$e_1, e_2, e_3$. First, we let $x_1, x_2, x_3$ be $\beta_1, \beta_2, \beta_3$, the roots of $z^3 - 3z^2 + 5$; then
the values of $e_1, e_2, e_3$ are (up to sign) the coefficients of this polynomial, and we have
\[ f^{(j)}(\beta_1) + f^{(j)}(\beta_2) + f^{(j)}(\beta_3) = P(e_1, e_2, e_3) = P(3, 0, -5) , \]
which is a multiple of $(n+1)!$. Since $\beta_4, \beta_5, \beta_6$ are the roots of $z^3 - 6z^2 + 9z - 5$
we have similarly
\[ f^{(j)}(\beta_4) + f^{(j)}(\beta_5) + f^{(j)}(\beta_6) = P(6, 9, 5) , \]
also a multiple of $(n+1)!$. Finally,
\[ f^{(j)}(\beta_7) = f^{(j)}(3) \]
is a multiple of $(n+1)!$. Putting all these results together, and evaluating
$F(0)$ separately, we can show that
\[ J = 75^{n+1} p^3\, n! + \{\text{a multiple of } (n+1)!\} . \tag{5.21} \]


Let $R$ be a real number greater than the absolute values of all the $\beta_k$. Then
$e^z$ and $h(z)$ are bounded on $|z| \le R$, say $|h(z)| \le c_1$ and $|e^z| \le c_2$ there, and so
we can estimate the integrals appearing in (5.18) to obtain a bound
\[ |J| \le 7\,|p|^3 |q|^3\, R \cdot R^n c_1^{n+1} c_2 \le c_3 c_4^n . \tag{5.22} \]
If $n$ is chosen suitably, then (5.21) and (5.22) are incompatible; this is a
contradiction, and so $e^\alpha$ is irrational.


In exercise 5.5 we ask the reader to fill in the details which have been 
omitted from this proof. 
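One of the routine details is the factorisation (5.16). Since $\alpha_1 + \alpha_2 + \alpha_3 = 3$, each of $\beta_4, \beta_5, \beta_6$ has the form $3 - \alpha_k$, so their monic polynomial is obtained by substituting $3 - z$ into $z^3 - 3z^2 + 5$ and changing sign. The sketch below carries this out with integer coefficient lists and also confirms $h(0) = 75$, the constant appearing in (5.21); the helper names are illustrative.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def trim(p):
    """Drop trailing zero coefficients."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def compose(p, q):
    """Coefficients of p(q(z)), by Horner's rule on polynomials."""
    result = [0]
    for c in reversed(p):
        result = poly_add(poly_mul(result, q), [c])
    return trim(result)

minimal = [5, 0, -3, 1]                      # z^3 - 3z^2 + 5
shifted = compose(minimal, [3, -1])          # minimal(3 - z)
second_factor = [-c for c in shifted]        # monic: z^3 - 6z^2 + 9z - 5

h = poly_mul(poly_mul([-3, 1], minimal), second_factor)   # (z - 3) * ... as in (5.16)
h_at_zero = h[0]
```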


The additional difficulty referred to in the preamble to Theorem 5.10 is that
the expansion (5.15) forces us to consider not only the conjugates $\alpha_1, \alpha_2, \alpha_3$
of the given exponent $\alpha$ but also sums of these conjugates. The success of
the proof depended upon two important properties of these sums. Firstly, the
expressions fall into three sets
\[ \{ \beta_1, \beta_2, \beta_3 \} , \quad \{ \beta_4, \beta_5, \beta_6 \} , \quad \{ \beta_7 \} , \]
and each of these sets consists of the roots of some integer polynomial. That
is, each is the set of conjugates of some algebraic number. Secondly, for any
two $\beta$ in the same set, the corresponding terms $e^\beta$ in (5.15) share a common
coefficient. This enables us to factor out the coefficient in (5.18) and evaluate
$J$ in terms of expressions such as (5.20). Both of these features will be of
crucial importance in future proofs. Before turning to the climactic result of
this chapter, we give one further proof involving a specific number.


Theorem 5.11. $e^{\sqrt{2}}$ is transcendental.


Proof. Suppose, to the contrary, that $e^{\sqrt{2}}$ is an algebraic number of degree
$m$ having minimal polynomial
\[ p(z) = c_m z^m + c_{m-1} z^{m-1} + \cdots + c_1 z + c_0 \tag{5.23} \]
over $\mathbb{Z}$, with $c_0$ and $c_m$ non-zero. Then
\[ p(e^{\sqrt{2}})\, p(e^{-\sqrt{2}}) = 0 . \]
Consider the product on the left-hand side,
\[ (c_0 + c_1 e^{\sqrt{2}} + \cdots + c_m e^{m\sqrt{2}})(c_0 + c_1 e^{-\sqrt{2}} + \cdots + c_m e^{-m\sqrt{2}}) . \]
For every term $c_j e^{j\sqrt{2}} c_k e^{-k\sqrt{2}}$ with $j \ne k$ there is another term $c_k e^{k\sqrt{2}} c_j e^{-j\sqrt{2}}$;
therefore the terms $e^{(j-k)\sqrt{2}}$ and $e^{-(j-k)\sqrt{2}}$ have the same coefficient and we
may write
\[ 0 = a_0 + a_1 (e^{\sqrt{2}} + e^{-\sqrt{2}}) + \cdots + a_m (e^{m\sqrt{2}} + e^{-m\sqrt{2}}) \tag{5.24} \]
with $a_0, a_1, a_2, \ldots, a_m \in \mathbb{Z}$, and specifically
\[ a_0 = c_0^2 + c_1^2 + c_2^2 + \cdots + c_m^2 \ne 0 . \]
Observe that (5.24) has two properties which have been commented upon
above: the sum involves pairs of numbers $\beta, -\beta$ which are conjugates; and
the terms $e^\beta$, $e^{-\beta}$ share a common coefficient. The actual values of $a_1, \ldots, a_m$
will turn out to be unimportant; so we have not sought to particularise these
coefficients by giving specific formulae, as we did in (5.15). However, it is
important, indeed vital, that the integer term $a_0$ is not zero; this will not
be “automatically” true in future proofs, and will require further argument.

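The remainder of the proof will follow familiar lines: once again we shall
go through it quite lightly, and invite the reader to fill in details. As in (5.16),
we define a monic polynomial with integer coefficients, whose roots are the
non-zero exponents in (5.24),
\[ h(z) = (z^2 - 2)(z^2 - 8)(z^2 - 18) \cdots (z^2 - 2m^2) ; \]
and for any positive integer $n$ we set
\[ f(z) = z^n h(z)^{n+1} . \]
We shall employ the integrals
\[ I_\beta = \int_0^\beta f(z)\, e^{\beta - z} \, dz , \]
which are evaluated as
\[ I_\beta = F(0)e^\beta - F(\beta) \quad\text{with}\quad F(z) = f(z) + f'(z) + f''(z) + \cdots . \]
Consider
\[ \begin{aligned}
J &= \sum_{k=1}^{m} a_k \bigl( I_{k\sqrt{2}} + I_{-k\sqrt{2}} \bigr) \\
&= -a_0 F(0) - \sum_{k=1}^{m} a_k \sum_{j=0}^{\infty} \bigl( f^{(j)}(k\sqrt{2}) + f^{(j)}(-k\sqrt{2}) \bigr) ,
\end{aligned} \]
where the second line follows from (5.24).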


For any $\beta$ in $E = \{ \sqrt{2}, -\sqrt{2}, \ldots, m\sqrt{2}, -m\sqrt{2} \}$, the polynomial $f(z)$ has a
factor $(z - \beta)^{n+1}$, and so $f^{(j)}(\beta) = 0$ whenever $j \le n$. If $j \ge n+1$, then
$f^{(j)}(z)$ is a polynomial whose coefficients are integers divisible by $(n+1)!$. So
\[ f^{(j)}(x_1) + f^{(j)}(x_2) \]
is a symmetric polynomial in two variables, having coefficients in $(n+1)!\,\mathbb{Z}$;
and $k\sqrt{2}$, $-k\sqrt{2}$ are the roots of a monic polynomial with integer coefficients;
so by Corollary 5.6 on the evaluation of symmetric polynomials,
\[ f^{(j)}(k\sqrt{2}) + f^{(j)}(-k\sqrt{2}) \]
is a multiple of $(n+1)!$. Evaluating $F(0)$ along the lines of previous proofs
holds no surprises, and we obtain
\[ J = -a_0\, h(0)^{n+1}\, n! + \{\text{a multiple of } (n+1)!\} . \]
Now let $c_1$ be the maximum absolute value of $a_1, a_2, \ldots, a_m$; set $R = m\sqrt{2}$,
so that $|\beta| \le R$ for all $\beta \in E$. Since $h(z)$ is continuous for all $z$, there exists $c_2$
such that $|h(z)| \le c_2$ whenever $|z| \le R$; and for similar reasons, the functions
$e^{\beta - z}$ with $\beta \in E$ have a common upper bound $c_3$. Then we have the estimates
\[ |I_\beta| \le R \cdot R^n c_2^{n+1} c_3 \]
for each $\beta \in E$, and
\[ |J| \le 2m\, c_1 (c_2 R)^{n+1} c_3 = c_4 c_5^n . \tag{5.25} \]
Finally, choose $n$ such that
\[ n! > c_4 c_5^n \quad\text{and}\quad n+1 \text{ is prime} \quad\text{and}\quad n+1 > |a_0|,\ |h(0)| . \]
Then $J$ is not a multiple of $(n+1)!$, so $J \ne 0$; and $J$ is a multiple of $n!$, so
\[ |J| \ge n! > c_4 c_5^n , \]
which contradicts (5.25). Therefore, $e^{\sqrt{2}}$ is transcendental.
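The symmetry of the coefficients in (5.24), and the formula for $a_0$, are purely formal facts about the product $p(x)\,p(1/x)$ with $x$ standing for $e^{\sqrt{2}}$. The sketch below expands that product for an assumed sample coefficient list; the names are illustrative.

```python
def paired_coefficients(c):
    """Expand p(x) * p(1/x) for p(x) = sum c[j] x^j.
    Returns a dict mapping each exponent d to the coefficient of x^d."""
    coeffs = {}
    for j, cj in enumerate(c):
        for k, ck in enumerate(c):
            coeffs[j - k] = coeffs.get(j - k, 0) + cj * ck
    return coeffs

c = [2, -1, 0, 3]                 # assumed sample c_0, ..., c_m
a = paired_coefficients(c)

symmetric = all(a[d] == a[-d] for d in a)          # x^d and x^(-d) share a coefficient
a0_is_sum_of_squares = a[0] == sum(cj * cj for cj in c)
```

Because $c_0 \ne 0$, the sum of squares is strictly positive, which is why $a_0 \ne 0$ comes for free in this proof.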


5.4 TRANSCENDENCE OF $e^\alpha$


What Lindemann actually proved in 1882 was the following, which has the
transcendence of $\pi$ as an immediate consequence.


Theorem 5.12. Lindemann’s Theorem. If $\alpha$ is a non-zero algebraic number,
then $e^\alpha$ is transcendental.


It is easy to see that this result supersedes all the previous results of this 
chapter. The proof, however, is a good deal more involved, and so we trust 
that the reader will not begrudge the time spent on earlier proofs, which 
have served to introduce many of the fundamental ideas that we shall need. 
There are still a few issues that we have not yet dealt with, and an informal 
presentation is likely to be of value; on the other hand, the reader will, no 
doubt, wish to see a detailed and rigorous proof; we shall give both. 


So, let $\alpha$ be algebraic and non-zero; we seek to show that $e^\alpha$ is
transcendental. We may assume without loss of generality (exercise!) that $\alpha$ is an algebraic
integer, and we denote its conjugate algebraic integers by $\alpha_1, \alpha_2, \ldots, \alpha_\ell$, with
$\alpha_1 = \alpha$.

Now suppose that $e^\alpha$ is an algebraic number of degree $m$ and has minimal
polynomial
\[ p(z) = c_m z^m + c_{m-1} z^{m-1} + \cdots + c_1 z + c_0 \tag{5.26} \]
with integer coefficients, $c_m$ and $c_0$ being non-zero. Then
\[ p(e^{\alpha_1})\, p(e^{\alpha_2}) \cdots p(e^{\alpha_\ell}) = 0 , \tag{5.27} \]


because the first factor is zero. We may use (5.26) to expand the left-hand
side of (5.27), giving a sum with integer coefficients which, for the time being,
we write “schematically” as
\[ \sum_{\beta \in E} \{\text{some coefficient}\}\, e^\beta ; \tag{5.28} \]
our first concern will be to examine this sum more closely. The collection $E$
of exponents consists of all sums of the form
\[ \beta = x_1\alpha_1 + x_2\alpha_2 + \cdots + x_\ell\alpha_\ell \]
with each $x_j$ in $\{ 0, 1, \ldots, m \}$; note that not all $(m+1)^\ell$ terms in the collection
need be distinct². Our aim will be to define certain integrals $I_\beta$ and a sum
something like
\[ J = \sum_{\beta \in E} \{\text{some coefficient}\}\, I_\beta \tag{5.29} \]
(exact details later) and to evaluate the sum in terms of certain function
values $F(\beta)$. The procedure should be familiar from earlier proofs. By referring
back to these arguments, we see that in general, an individual term $F(\beta)$
could be almost anything, and will give us little information; the way to make
progress will be to evaluate sums of the form
\[ F(\beta_1) + F(\beta_2) + \cdots + F(\beta_k) , \tag{5.30} \]
where the $\beta_j$ are conjugates. To do this by means of symmetry arguments,
two things are essential.

• The $\beta_1, \beta_2, \ldots, \beta_k$ in (5.30) must comprise all conjugates of some fixed
algebraic number. For example, if $\beta_1, \beta_2, \beta_3$ are conjugates, there is not
much we can say about an expression such as $F(\beta_1) + F(\beta_2)$.

• The coefficients of terms in (5.28) with conjugate exponents must be
the same, so that when evaluating (5.29) we can factor out the common
coefficient and leave a “pure” sum such as (5.30).


In terms of the collection $E$, this comes down to two requirements.

• If any algebraic number occurs in $E$, then all its conjugates must occur
too, and with the same multiplicity.

• If $\beta_1$ and $\beta_2$ are conjugate elements of $E$, then the coefficients of $e^{\beta_1}$
and $e^{\beta_2}$ in (5.28) must be the same.

²This is why we say “collection” rather than “set”, a term which normally indicates that
repetitions are to be disregarded.


These considerations lead to the following definitions. 

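Definition 5.2. A complete collection of conjugates is a finite collection
$B$ of algebraic numbers such that if $\beta$ is in $B$ and $\beta'$ is an algebraic conjugate
of $\beta$, then $\beta'$ is also in $B$, and occurs with the same multiplicity as $\beta$. A
complete set of conjugates is a set consisting of all the conjugates of some
algebraic number (occurring once each).

Examples. The following are complete sets of conjugates, where we write
$\zeta = e^{2\pi i/3}$:
• $\{ \sqrt{2}, -\sqrt{2} \}$;
• $\{ \sqrt[3]{5}, \sqrt[3]{5}\,\zeta, \sqrt[3]{5}\,\zeta^2 \}$.
The following are complete collections of conjugates:
• $\{ \sqrt{2}, -\sqrt{2} \}$;
• $\{ \sqrt{2}, -\sqrt{2}, \sqrt[3]{5}, \sqrt[3]{5}\,\zeta, \sqrt[3]{5}\,\zeta^2 \}$;
• $\{ \sqrt{2}, -\sqrt{2}, \sqrt{2}, -\sqrt{2}, \sqrt{2}, -\sqrt{2}, \sqrt[3]{5}, \sqrt[3]{5}\,\zeta, \sqrt[3]{5}\,\zeta^2 \}$.
The following are not complete collections of conjugates:
• $\{ \sqrt{2}, -\sqrt{2}, \sqrt[3]{5}, \sqrt[3]{5}\,\zeta \}$, because $\sqrt[3]{5}\,\zeta^2$ is missing;
• $\{ \sqrt{2}, -\sqrt{2}, \sqrt{2}, \sqrt{2}, \sqrt[3]{5}, \sqrt[3]{5}\,\zeta, \sqrt[3]{5}\,\zeta^2 \}$, since $\sqrt{2}$ and $-\sqrt{2}$ do not occur
the same number of times.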


Comment. It is clear that a complete collection of conjugates can be parti- 
tioned into subsets which are complete sets of conjugates (and which need not 
all be distinct). 


The following result gives a useful way of identifying complete collections of 


conjugates. 


Lemma 5.13. A completeness criterion. A collection B of algebraic numbers 
is a complete collection of conjugates if and only if the polynomial 


$$Q(z) = \prod_{\beta \in B} (z - \beta) \tag{5.31}$$
has rational coefficients. Moreover, if the elements of $B$ are algebraic integers,
then $Q(z)$ has integer coefficients.


Proof. Let B be a complete collection of conjugates. Partition B into complete 
sets of conjugates $B_1, B_2, \ldots, B_k$; each of these consists of the conjugates of
some algebraic number $\beta_j$, occurring once each. Then

$$\prod_{\beta \in B_j} (z - \beta)$$

is the minimal polynomial of $\beta_j$ and therefore has rational coefficients; and
Q(z) is the product of these k rational polynomials. Moreover, if B consists of 
algebraic integers, then each minimal polynomial has integer coefficients, and 
so does Q(z). 

Conversely, suppose that the polynomial Q(z) in (5.31) has rational coef- 
ficients, and factorise it into powers of distinct (rational) irreducible polyno- 
mials, 

$$Q(z) = Q_1(z)^{s_1} Q_2(z)^{s_2} \cdots Q_k(z)^{s_k} .$$
Then any $\beta$ in $B$ is a root of exactly one $Q_j(z)$; any conjugate $\beta'$ of $\beta$ is a root
of the same $Q_j(z)$; and $\beta$, $\beta'$ both occur $s_j$ times in $B$. So $B$ is a complete
collection of conjugates. 
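
The completeness criterion lends itself to a quick numerical experiment. The following Python sketch is an illustration only and is not part of the original argument: the function names `poly_from_roots` and `looks_integral` are our own, and the test collections are built from $\sqrt{2}$ and the cube roots of 5. It multiplies out the product of the factors $(z - \beta)$ in floating-point arithmetic and asks whether every coefficient is close to an integer. Since the sample elements are algebraic integers, integer coefficients are what the lemma predicts; a floating-point check is evidence, not a proof.

```python
import cmath

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = coeffs + [0j]            # coefficients of z * p(z)
        for i, c in enumerate(coeffs):     # subtract r * p(z)
            shifted[i + 1] -= r * c
        coeffs = shifted
    return coeffs

def looks_integral(coeffs, tol=1e-9):
    """True if every coefficient is within tol of an ordinary integer."""
    return all(abs(c - round(c.real)) < tol for c in coeffs)

zeta = cmath.exp(2j * cmath.pi / 3)        # primitive cube root of unity
r2 = 2 ** 0.5                              # square root of 2
c5 = 5 ** (1 / 3)                          # real cube root of 5

complete = [r2, -r2, c5, c5 * zeta, c5 * zeta**2]
missing_one = [r2, -r2, c5, c5 * zeta]     # one conjugate of the cube root is absent
unbalanced = [r2, -r2, r2, r2, c5, c5 * zeta, c5 * zeta**2]

for name, coll in [("complete", complete),
                   ("missing_one", missing_one),
                   ("unbalanced", unbalanced)]:
    print(name, looks_integral(poly_from_roots(coll)))
```

For the first collection the product is $(z^2 - 2)(z^3 - 5)$, so the check succeeds; for the other two, irrational or non-real coefficients appear and the check fails.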


We shall shortly use this lemma to show that certain subsets of $E$ are
complete collections of conjugates. Now we address the issue of the coefficients.
The expansion of (5.27) gives a sum of terms

$$c_{x_1} c_{x_2} \cdots c_{x_l} \, e^{x_1\alpha_1 + x_2\alpha_2 + \cdots + x_l\alpha_l}$$

over all $l$-tuples $x$ in $X = \{0, 1, \ldots, m\}^l$. If $x$ is in $X$ and $y$ has the same
components as $x$ but in a different order, then $y$ is also in $X$; therefore we
can partition $X$ into subsets, each consisting of all possible vectors with a
given collection of entries, and this induces a partition of the collection $E$ of
exponents in (5.28). It is clear that if $x$ and $y$ belong to the same subset of
$X$, then

$$c_{x_1} c_{x_2} \cdots c_{x_l} = c_{y_1} c_{y_2} \cdots c_{y_l} ,$$


and so (5.28) can be written as a sum of sums having the form 


$$d_k \sum_{\beta \in D_k} e^{\beta} \tag{5.32}$$

where each $d_k$ is a product of certain coefficients from the polynomial (5.26),
and the $D_k$ form the partition of $E$ referred to above. We shall show that each
$D_k$ is a complete collection of conjugates; therefore the sums (5.32) can be
split into sums over complete sets of conjugates $E_k$. It is possible that when
this is done, the same complete set of conjugates may occur more than once;
but if we collect terms with the same $E_k$, we find that (5.28) becomes a sum
of sums

$$a_k \sum_{\beta \in E_k} e^{\beta}$$

in which each $E_k$ is a complete set of conjugates, and no two $E_k$ are the
same. We note that the coefficient $d_k$ of each sum in (5.32) is an integer; after
collecting terms, the coefficients $a_k$ will also be integers.


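Example. To illustrate the transformation of the product (5.27) into a sum
over complete sets of conjugates, consider the case

$$\alpha_1 = \sqrt{2} + \sqrt{3}, \quad \alpha_2 = \sqrt{2} - \sqrt{3}, \quad \alpha_3 = -\sqrt{2} + \sqrt{3}, \quad \alpha_4 = -\sqrt{2} - \sqrt{3}$$

(so that $l = 4$); and take $m = 1$ (so that we are assuming $e^{\alpha}$ is rational).
We tabulate the quadruples $x$ in $X = \{0, 1\}^4$; the size of the subset of $X$
containing $x$; the corresponding linear combinations $\beta$; and simplified forms
of $\beta$.

Line   $x$                Size   $\beta$                                        Simplified $\beta$
1      (0,0,0,0)        1      $0$                                            $0$
2      (1,0,0,0) etc.   4      $\alpha_1$ etc.                                $\alpha_1, \alpha_2, \alpha_3, \alpha_4$
3      (1,1,0,0) etc.   6      $\alpha_1 + \alpha_2$ etc.                     $2\sqrt{2},\ 2\sqrt{3},\ 0,\ 0,\ -2\sqrt{3},\ -2\sqrt{2}$
4      (1,1,1,0) etc.   4      $\alpha_1 + \alpha_2 + \alpha_3$ etc.          $\alpha_1, \alpha_2, \alpha_3, \alpha_4$
5      (1,1,1,1)        1      $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4$    $0$

Table 5.1 Sums of conjugates of $\sqrt{2} + \sqrt{3}$.

The following points are worth noting.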


• The $\beta$ values in lines 2 and 4 are identical, even though they come from
different quadruples; the same holds for lines 1 and 5.

• The values in lines 1, 2, 4 and 5 form complete sets of conjugates.

• The six values in line 3 form a complete collection of conjugates which
splits into four complete sets of conjugates

$$\{2\sqrt{2}, -2\sqrt{2}\}, \quad \{2\sqrt{3}, -2\sqrt{3}\}, \quad \{0\}, \quad \{0\} ;$$

two of these are the same and coincide with the sets in other lines.
Thus, if we write
$$E_0 = \{0\}, \quad E_1 = \{\alpha_1, \alpha_2, \alpha_3, \alpha_4\}, \quad E_2 = \{2\sqrt{2}, -2\sqrt{2}\}, \quad E_3 = \{2\sqrt{3}, -2\sqrt{3}\},$$
then the expansion of (5.27) takes the form
$$a_0 \sum_{\beta \in E_0} e^{\beta} + a_1 \sum_{\beta \in E_1} e^{\beta} + a_2 \sum_{\beta \in E_2} e^{\beta} + a_3 \sum_{\beta \in E_3} e^{\beta} \tag{5.33}$$

for certain integer coefficients $a_0, a_1, a_2, a_3$. In this example, it is not hard to
bash out the algebra and obtain the explicit expansion

$$(c_0^4 + 2c_0^2c_1^2 + c_1^4) + (c_0^3c_1 + c_0c_1^3)(e^{\alpha_1} + e^{\alpha_2} + e^{\alpha_3} + e^{\alpha_4})$$
$$\qquad + c_0^2c_1^2 \left(e^{2\sqrt{2}} + e^{-2\sqrt{2}}\right) + c_0^2c_1^2 \left(e^{2\sqrt{3}} + e^{-2\sqrt{3}}\right) .$$

But the point is that in more complex cases, this computation would be infeasible;
so we need to omit the calculation and, instead, understand the general
shape of the expression we would have obtained. 
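
For readers who would like to see the example confirmed by machine, here is a minimal Python sketch. It is our own illustration, not the author's computation: the names `exponents_by_pattern`, `product_form` and `expanded_form` are hypothetical, the coefficient values $c_0 = 3$, $c_1 = -2$ are arbitrary, and the arithmetic is floating-point. It groups the exponents by the pattern of entries in $x$, and compares the product of the four factors $c_0 + c_1 e^{\alpha_k}$ with the explicit expansion.

```python
from collections import Counter
from itertools import product
from math import exp, isclose, sqrt

R2, R3 = sqrt(2), sqrt(3)
ALPHAS = [R2 + R3, R2 - R3, -R2 + R3, -R2 - R3]   # conjugates of sqrt(2) + sqrt(3)

def exponents_by_pattern(alphas, m=1):
    """Group the exponents x1*a1 + ... + xl*al by the sorted entries of x."""
    groups = {}
    for x in product(range(m + 1), repeat=len(alphas)):
        pattern = tuple(sorted(x, reverse=True))
        beta = sum(xi * ai for xi, ai in zip(x, alphas))
        groups.setdefault(pattern, []).append(round(beta, 9) + 0.0)
    return groups

def product_form(c0, c1):
    """The product of the factors c0 + c1*exp(alpha_k)."""
    result = 1.0
    for a in ALPHAS:
        result *= c0 + c1 * exp(a)
    return result

def expanded_form(c0, c1):
    """The explicit expansion quoted in the text."""
    s1 = sum(exp(a) for a in ALPHAS)
    return ((c0**4 + 2 * c0**2 * c1**2 + c1**4)
            + (c0**3 * c1 + c0 * c1**3) * s1
            + c0**2 * c1**2 * (exp(2 * R2) + exp(-2 * R2))
            + c0**2 * c1**2 * (exp(2 * R3) + exp(-2 * R3)))

groups = exponents_by_pattern(ALPHAS)
for pattern, betas in sorted(groups.items()):
    print(pattern, len(betas), sorted(betas))
print(isclose(product_form(3, -2), expanded_form(3, -2), rel_tol=1e-9))
```

The five groups have sizes 1, 4, 6, 4, 1, and the group for the pattern (1,1,0,0) contains the value 0 twice, in agreement with the table.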


As foreshadowed above, we prove a result which will show that the sets
$D_k$ in (5.32) are complete collections of conjugates.


Lemma 5.14. Linear combinations of conjugates. Let $x_1, x_2, \ldots, x_l$ be integers,
not necessarily distinct. Let $X$ be the set of all ordered $l$-tuples $x$
whose elements are $x_1, x_2, \ldots, x_l$, though not necessarily in that order. Let
$\alpha_1, \alpha_2, \ldots, \alpha_l$ be a complete collection of conjugates. Then

$$E_X = \{\, x_1\alpha_1 + x_2\alpha_2 + \cdots + x_l\alpha_l \mid x \in X \,\}$$
is a complete collection of conjugates. 


Proof. Consider 


$$Q(z) = \prod_{\beta \in E_X} (z - \beta) ;$$

we wish to show that $Q(z)$ is unchanged by any permutation of the $\alpha_k$. Since
any permutation can be composed of successive transpositions of two elements,
it suffices to show that $E_X$, the collection of roots of $Q$, is unchanged by any
transposition of two $\alpha_k$. Suppose that we interchange $\alpha_i$ and $\alpha_j$. We can split
$X$ into

• a number of individual $l$-tuples $x$ in which $x_i = x_j$;

• a number of pairs of vectors

$$(\ldots, x_i, \ldots, x_j, \ldots) \quad\text{and}\quad (\ldots, x_j, \ldots, x_i, \ldots)$$

with $x_i \ne x_j$.

The corresponding elements of $E_X$ are

• expressions
$$\cdots + x_i\alpha_i + \cdots + x_j\alpha_j + \cdots$$
which, because $x_i = x_j$, are not altered when $\alpha_i$ and $\alpha_j$ are swapped;

• pairs of expressions
$$\cdots + x_i\alpha_i + \cdots + x_j\alpha_j + \cdots \quad\text{and}\quad \cdots + x_j\alpha_i + \cdots + x_i\alpha_j + \cdots ;$$
when $\alpha_i$ and $\alpha_j$ are swapped, each of these expressions becomes the
other, so the pair as a whole is unaltered.

Thus $E_X$ is unchanged by any transposition, and therefore by any permutation,
of the $\alpha_k$. Now $Q(z)$ has coefficients which are elementary symmetric
polynomials in the elements $\beta$ of $E_X$, and hence are polynomials with integer
coefficients in the $\alpha_k$. The argument just given shows that these coefficients are
in fact symmetric polynomials in the $\alpha_k$; but as the $\alpha_k$ form a complete collection
of conjugates they are the roots of a polynomial with rational coefficients,
and so by Corollary 5.6 the coefficients of $Q$ are rational. This completes the
proof that $E_X$ is a complete collection of conjugates.
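
A small numerical experiment may make the lemma more concrete. In the Python sketch below (our own illustration, with hypothetical names `linear_combinations` and `poly_from_roots`), the $\alpha_k$ are taken to be the three conjugates of the real cube root of 2, and the entries are 2, 1, 0. The six linear combinations $2\alpha_i + \alpha_j$ are multiplied out into a polynomial, and in floating-point arithmetic the coefficients come out as integers to within rounding error, as the lemma and the completeness criterion suggest they should.

```python
import cmath
from itertools import permutations

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = coeffs + [0j]            # coefficients of z * p(z)
        for i, c in enumerate(coeffs):     # subtract r * p(z)
            shifted[i + 1] -= r * c
        coeffs = shifted
    return coeffs

def linear_combinations(entries, alphas):
    """The collection E_X: one value for each distinct ordering of the entries."""
    orderings = sorted(set(permutations(entries)))
    return [sum(x * a for x, a in zip(xs, alphas)) for xs in orderings]

zeta = cmath.exp(2j * cmath.pi / 3)            # primitive cube root of unity
c2 = 2 ** (1 / 3)                              # real cube root of 2
alphas = [c2, c2 * zeta, c2 * zeta**2]         # a complete set of conjugates

E_X = linear_combinations((2, 1, 0), alphas)
coeffs = poly_from_roots(E_X)
rounded = [round(c.real) for c in coeffs]
error = max(abs(c - r) for c, r in zip(coeffs, rounded))
print(rounded, error)
```

In this instance the polynomial is $z^6 + 108$, which can be checked by hand by noting that the six roots are the cube roots of $\pm 6\sqrt{3}\, i$.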


We return to our expansion of (5.27) as a sum of sums over complete sets
of conjugates; readers may consider (5.33) as an exemplar. If $E_0 = \{0\}$, as in
this example, we recall from earlier proofs that part of the argument involves
showing that the sum $J$, which will be something like (5.29), is given by an
expression of the form

$$J = -a_0 h(0)^{n+1} n! + \{\text{a multiple of } (n+1)!\} ;$$

we then choose $n$ such that, among other requirements, $n + 1$ is a prime which
is not a factor of $a_0$. However, if it were to happen that $a_0 = 0$, this would be
impossible, and so we would like to rule out this mischance. Unfortunately,
in general we can’t do so: there are cases in which $a_0$ turns out to be zero
(see, for example, exercise 5.8). We’ll then need to modify $J$ in order to make
use of a different non-zero coefficient instead of $a_0$. There are two issues here:
firstly, can we be certain that there is a non-zero coefficient in our expansion
at all? and secondly, how can we use a non-zero coefficient (other than $a_0$) to
complete the proof?


The first question is not difficult to answer in the affirmative. As a first
attempt, we consider a product

$$(s_1 e^{\sigma_1} + \cdots + s_M e^{\sigma_M})(t_1 e^{\tau_1} + \cdots + t_N e^{\tau_N}) \tag{5.34}$$

in which all the $\sigma_i$ are distinct, all $\tau_j$ are distinct, all coefficients $s_i$ and $t_j$ are
non-zero, and all $\sigma_i$ and $\tau_j$ are real. By reordering the sums if necessary, we
may assume that the smallest of the $\sigma_i$ is $\sigma_1$ and the smallest of the $\tau_j$ is $\tau_1$.
Then the expansion of (5.34) contains a non-zero term

$$s_1 t_1 e^{\sigma_1 + \tau_1} ; \tag{5.35}$$

any other term in the expansion has exponent larger than $\sigma_1 + \tau_1$. Therefore,
when terms having the same exponent are collected, (5.35) remains alone, is
not cancelled by any other term, and has a non-zero coefficient.


The defect of the argument just given is that it only deals with real exponents;
the exponents $\alpha_k$ that we want for our main proof may well be complex.
It turns out that there is less of a difficulty here than might be expected: all 
we need do is to define what we mean by saying that one complex number is 
“smaller” than another. This may sound like a doubtful procedure, since it is 
well known that “it is impossible to order complex numbers”; but this is only 
true if we want an order which is fully compatible with complex arithmetic, 
and here we can be satisfied with much less.

Lemma 5.15. Products of exponential sums. If a product
$$(s_1 e^{\sigma_1} + \cdots + s_M e^{\sigma_M})(t_1 e^{\tau_1} + \cdots + t_N e^{\tau_N}) ,$$
in which all the $\sigma_i$ are distinct complex numbers, all $\tau_j$ are distinct complex
numbers, and all coefficients $s_i$ and $t_j$ are non-zero, is expanded and terms
with the same exponent are collected, then there will be at least one term with
a non-zero coefficient.


Proof. Define the lexicographic order on the set of complex numbers: $z_1 \prec z_2$
means

$$\operatorname{Re}(z_1) < \operatorname{Re}(z_2) \quad\text{or}\quad \operatorname{Re}(z_1) = \operatorname{Re}(z_2) \text{ and } \operatorname{Im}(z_1) < \operatorname{Im}(z_2) .$$

Without loss of generality we may assume that $\sigma_1$ is the smallest of the $\sigma_i$ with
respect to this ordering, and $\tau_1$ is the smallest of the $\tau_j$. Then the expansion
of the product contains a non-zero term

$$s_1 t_1 e^{\sigma_1 + \tau_1} . \tag{5.36}$$

Any other term in the expansion has exponent $\sigma_p + \tau_q$ with either $\sigma_1 \prec \sigma_p$
or $\tau_1 \prec \tau_q$; this exponent is greater than, and therefore not equal to, $\sigma_1 + \tau_1$.
Hence, when terms having equal exponent are collected, (5.36) remains by
itself and has a non-zero coefficient.


Comment. In the above proof we have implicitly used certain “obvious” 
properties of the lexicographic order which are stated more carefully in the 
appendix. 
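
The mechanism of the lemma can be watched at work in a few lines of Python. The sketch below is an illustration under simplifying assumptions of our own: exponents are restricted to Gaussian integers and stored as pairs (real part, imaginary part), so that all arithmetic is exact, and each exponential sum is represented by a dictionary from exponent to coefficient. Python compares tuples entry by entry, which is precisely the lexicographic order used in the proof, so `min` picks out the smallest exponent. The function name `multiply` is hypothetical.

```python
from collections import defaultdict

def multiply(sum1, sum2):
    """Multiply two exponential sums and collect terms with equal exponents.

    Each sum maps an exponent (re, im) to its coefficient.
    """
    out = defaultdict(int)
    for (r1, i1), s in sum1.items():
        for (r2, i2), t in sum2.items():
            out[(r1 + r2, i1 + i2)] += s * t
    return dict(out)

# (2 + 3e^i)(2 - 3e^i): the cross terms cancel, the smallest term does not.
first = {(0, 0): 2, (0, 1): 3}
second = {(0, 0): 2, (0, 1): -3}
expansion = multiply(first, second)

sigma_1, tau_1 = min(first), min(second)       # lexicographically smallest exponents
lead = (sigma_1[0] + tau_1[0], sigma_1[1] + tau_1[1])

print(expansion)
print(lead, expansion[lead], first[sigma_1] * second[tau_1])
```

Here the coefficient of $e^{i}$ in the expansion is zero, showing that cancellation can certainly occur; but the term with the smallest exponent survives with coefficient $s_1 t_1$.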


Let’s review where we are up to. We expand the product (5.27) and collect 
terms having the same exponent to obtain a sum of the form (5.28), and we 
know that at least one of the coefficients in this sum is non-zero. Since we 
have collected terms, no exponent in this sum occurs more than once; we have 
also shown that terms with conjugate exponents have the same coefficient; 
and we also know that the exponents can be partitioned into complete sets of 
conjugates. This means that we have an expression 


$$a_0 \sum_{\beta \in E_0} e^{\beta} + a_1 \sum_{\beta \in E_1} e^{\beta} + \cdots + a_s \sum_{\beta \in E_s} e^{\beta} = 0 \tag{5.37}$$

in which every $E_k$ is a complete set of conjugates, no two $E_k$ are the same, and
every coefficient $a_k$ is a non-zero integer. If it should happen that $E_0 = \{0\}$,
then we proceed with the kind of argument that we have seen already; but
this need not be the case. To obtain a non-zero integer term, we multiply
(5.37) by
$$\sum_{\gamma \in E_0} e^{-\gamma} . \tag{5.38}$$
We need to ensure that the resulting sum can still be partitioned into sums 
over complete sets of conjugates, and that it has a non-zero integer term.

Lemma 5.16. Subtracting complete collections of conjugates. Suppose that
$B$ and $C$ are complete collections of conjugates, and write

$$B - C = \{\, \beta - \gamma \mid \beta \in B,\ \gamma \in C \,\} .$$
Then $B - C$ is also a complete collection of conjugates.


Proof. Since B and C are complete collections of conjugates, 


$$P_B(z) = \prod_{\beta \in B} (z - \beta) \quad\text{and}\quad P_C(z) = \prod_{\gamma \in C} (z - \gamma)$$

are both polynomials with rational coefficients. Now we can write

$$Q(z) = \prod_{\beta \in B} \prod_{\gamma \in C} \bigl(z - (\beta - \gamma)\bigr) = \prod_{\gamma \in C} P_B(z + \gamma) ;$$

the coefficients of $Q$ are polynomials with rational coefficients in $\gamma_1, \gamma_2, \ldots, \gamma_r$,
the elements of $C$. Permuting these elements does not alter $Q$, so the coefficients
are symmetric polynomials in the $\gamma_k$. Since the $\gamma_k$ are the roots of
$P_C$, which has rational coefficients, Corollary 5.6 shows that $Q$ has rational
coefficients. Thus $B - C$ is a complete collection of conjugates, as claimed.


Now multiply (5.37) by (5.38). We obtain a sum of terms of the form 


$$a_k \sum_{\beta \in E_k} e^{\beta} \sum_{\gamma \in E_0} e^{-\gamma} = a_k \sum_{\beta \in E_k - E_0} e^{\beta} .$$

The lemma guarantees that the right-hand side is a sum over a complete
collection of conjugates, and therefore, as we have seen earlier, we can split
it into sums over complete sets of conjugates. Moreover, if $k \ne 0$, then $E_k$
and $E_0$ are distinct complete sets of conjugates and are therefore disjoint, so
none of the terms on the right-hand side has zero exponent. The sum over
$\beta \in E_0 - E_0$ is split into complete sets of conjugates in the same way; in
this case exactly $|E_0|$ of the exponents will be $\beta - \beta = 0$. Hence we have an
expression

$$\Bigl( a_0 \sum_{\beta \in E_0} e^{\beta} + a_1 \sum_{\beta \in E_1} e^{\beta} + \cdots + a_s \sum_{\beta \in E_s} e^{\beta} \Bigr) \Bigl( \sum_{\gamma \in E_0} e^{-\gamma} \Bigr)$$
$$\qquad = a_0 |E_0| + a_1 \sum_{\beta \in E_1} e^{\beta} + \cdots + a_t \sum_{\beta \in E_t} e^{\beta} = 0 ,$$

where, in order not to run out of the letters of the alphabet, we have re-used
notation: the sets $E_k$ in the second line need not be the same as the previous
$E_k$, but they are still complete sets of conjugates; and the coefficients $a_k$ need
not be the same as the previous $a_k$, but they are still integers. Moreover,
$a_0 \ne 0$, and none of the $E_k$ contains zero.
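
The count of zero exponents can be illustrated with the sets from the earlier example. In the Python sketch below we assume, purely for illustration, that the role of $E_0$ is played by $\{2\sqrt{2}, -2\sqrt{2}\}$ rather than by $\{0\}$; the function names are our own and the arithmetic is floating-point, with a small tolerance deciding what counts as zero.

```python
from math import sqrt

def difference_collection(B, C):
    """The collection B - C of all differences b - c, with repetitions kept."""
    return [b - c for b in B for c in C]

def count_zeros(values, tol=1e-12):
    """Number of entries that vanish to within the tolerance."""
    return sum(1 for v in values if abs(v) < tol)

r2, r3 = sqrt(2), sqrt(3)
E0 = [2 * r2, -2 * r2]                              # plays the role of E_0 here
E1 = [r2 + r3, r2 - r3, -r2 + r3, -r2 - r3]         # conjugates of sqrt(2) + sqrt(3)
E3 = [2 * r3, -2 * r3]

zeros_same = count_zeros(difference_collection(E0, E0))   # expect |E_0| zeros
zeros_e1 = count_zeros(difference_collection(E1, E0))     # disjoint sets: no zeros
zeros_e3 = count_zeros(difference_collection(E3, E0))     # disjoint sets: no zeros
print(zeros_same, zeros_e1, zeros_e3)
```

The first count equals $|E_0| = 2$ and the others are zero, which is the pattern the argument relies upon.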


We have now overcome all the difficulties involved in our generalisation of 
previous results, and are ready to give a proper proof of the desired theorem.

Theorem 5.12. Lindemann’s Theorem [38]. If $\alpha$ is a non-zero algebraic number,
then $e^{\alpha}$ is transcendental.


Proof. Let $\alpha$ be a non-zero algebraic number; without loss of generality, $\alpha$ is
an algebraic integer. Let the conjugates of $\alpha$ be $\alpha_1, \alpha_2, \ldots, \alpha_l$, with $\alpha_1 = \alpha$.
Suppose that $e^{\alpha}$ is an algebraic number of degree $m$ having minimal polynomial

$$p(z) = c_m z^m + c_{m-1} z^{m-1} + \cdots + c_1 z + c_0$$

with integer coefficients, where $c_m$ and $c_0$ are non-zero. We have
$$p(e^{\alpha_1})\, p(e^{\alpha_2}) \cdots p(e^{\alpha_l}) = 0 ,$$

because the first factor is zero. Writing the left-hand side as a product of
factors
$$c_0 + c_1 e^{\alpha_k} + c_2 e^{2\alpha_k} + \cdots + c_m e^{m\alpha_k} \tag{5.39}$$


and expanding yields a sum $S$ of terms $e^{\beta}$ times integer coefficients. If we
write

$$X = \{0, 1, 2, \ldots, m\}^l$$

for the set of all tuples $x = (x_1, x_2, \ldots, x_l)$ of integers from $\{0, 1, 2, \ldots, m\}$,
then the exponents $\beta$ appearing in the sum are the elements of the collection

$$E = \{\, x_1\alpha_1 + x_2\alpha_2 + \cdots + x_l\alpha_l \mid x \in X \,\} ;$$

this collection consists of $(m + 1)^l$ algebraic integers, not all of which need be
distinct. Now partition $X$ into subsets, each consisting of all possible $l$-tuples
with a particular collection of entries, and consider the exponents $\beta \in E$ corresponding
to one particular subset $X_0$. Since $\{\alpha_1, \alpha_2, \ldots, \alpha_l\}$ is a complete
set of conjugates, the “linear combinations of conjugates” lemma shows that
these exponents form a complete collection of conjugates, which can be split
into one or more complete sets of conjugates. Also, the coefficient of $e^{\beta}$ is

$$c_{x_1} c_{x_2} \cdots c_{x_l} ,$$


and this is the same for all $x \in X_0$. Therefore, $S$ can be written as a sum of
sums
$$a \sum_{\beta} e^{\beta}$$

where in each sum the values of $\beta$ range over a complete set of conjugates, and
each coefficient is an integer. If any two complete sets of conjugates appearing
in the sum are the same, we can collect the corresponding terms; thus, we have
an expression for $S$ in which no complete set of conjugates appears more than
once. Since this expression was obtained by expanding a product of terms like
(5.39) and then collecting terms with the same exponent, the “products of
exponential sums” lemma, Lemma 5.15, extended inductively to a product of
$l$ factors, guarantees that at least one of the coefficients is non-zero. Choose
a term

$$a_0 \sum_{\beta \in E_0} e^{\beta}$$

with $a_0 \ne 0$, and multiply $S$ by
$$\sum_{\gamma \in E_0} e^{-\gamma} . \tag{5.40}$$


Thanks to the “subtracting complete collections of conjugates” lemma, the
new sum still consists of a sum of sums of exponentials over complete collections
of conjugates, times integer coefficients; these may once again be split
into complete sets of conjugates. The product

$$\sum_{\beta \in E_0} e^{\beta} \sum_{\gamma \in E_0} e^{-\gamma}$$

contains exactly $|E_0|$ terms $e^{\beta - \beta} = 1$; to put it another way, when $E_0 - E_0$ is
decomposed into complete sets of conjugates, $\{0\}$ occurs exactly $|E_0|$ times.
Any other product

$$\sum_{\beta \in E_k} e^{\beta} \sum_{\gamma \in E_0} e^{-\gamma} = \sum_{\beta \in E_k} \sum_{\gamma \in E_0} e^{\beta - \gamma} ,$$

since $E_k$ and $E_0$ are disjoint, contains no zero exponents. Thus, finally, multiplying
$S$ by (5.40) gives an expression

$$a_0 |E_0| + a_1 \sum_{\beta \in E_1} e^{\beta} + \cdots + a_t \sum_{\beta \in E_t} e^{\beta} = 0 , \tag{5.41}$$
where the coefficients $a_k$ are integers with $a_0 \ne 0$, and $E_1, E_2, \ldots, E_t$ are
complete sets of conjugates not containing zero.


The argument which deduces a contradiction from (5.41) will not differ 
greatly from earlier proofs in this chapter. For any positive integer n, define 


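$$h(z) = \prod_{k=1}^{t} \prod_{\beta \in E_k} (z - \beta) , \qquad f(z) = z^n h(z)^{n+1}$$

and
$$I_\beta = \int_0^{\beta} e^{\beta - z} f(z)\, dz ,$$

where the path of integration is the straight line from $0$ to $\beta$ in the complex
plane. We note that the roots of $h$ are algebraic integers forming a complete
collection of conjugates, and so $h(z)$ has integer coefficients; moreover, none
of the roots is zero, so $h(0) \ne 0$. We can evaluate the integrals as

$$I_\beta = e^{\beta} F(0) - F(\beta) ,$$

where
$$F(z) = f(z) + f'(z) + f''(z) + \cdots .$$
Now let
$$J = \sum_{k=1}^{t} \Bigl( a_k \sum_{\beta \in E_k} I_\beta \Bigr) ; \tag{5.42}$$
we have

$$J = F(0) \sum_{k=1}^{t} a_k \sum_{\beta \in E_k} e^{\beta} - \sum_{k=1}^{t} a_k \sum_{\beta \in E_k} F(\beta)$$
$$\quad = -a_0 |E_0| F(0) - \sum_{k=1}^{t} a_k \sum_{\beta \in E_k} \sum_{j=0}^{\infty} f^{(j)}(\beta) , \tag{5.43}$$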


noting that the sum over j is really a finite sum, and so there are no conver- 
gence issues. Now 


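$$f^{(j)}(0) = j! \times \{\text{coefficient of } z^j \text{ in } f(z)\} = \begin{cases} 0 & \text{if } j < n \\ h(0)^{n+1}\, n! & \text{if } j = n \\ \text{a multiple of } (n+1)! & \text{if } j > n, \end{cases}$$
and so
$$F(0) = h(0)^{n+1} n! + \{\text{a multiple of } (n+1)!\} .$$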
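
These divisibility claims are easy to test with exact integer arithmetic. The Python sketch below is an illustration only: it assumes a sample polynomial $h(z) = (z^2 - 8)(z^2 - 12) = z^4 - 20z^2 + 96$, whose roots $\pm 2\sqrt{2}$, $\pm 2\sqrt{3}$ come from the earlier example, and the sample value $n = 4$; the function names are our own. It builds $f(z) = z^n h(z)^{n+1}$, reads off the derivatives at zero from the coefficients, and checks the three cases together with the resulting formula for $F(0)$.

```python
from math import factorial

def poly_mul(p, q):
    """Multiply integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def f_coefficients(h, n):
    """Coefficients of f(z) = z^n * h(z)^(n+1), lowest degree first."""
    power = [1]
    for _ in range(n + 1):
        power = poly_mul(power, h)
    return [0] * n + power

def derivative_at_zero(coeffs, j):
    """The j-th derivative at 0 equals j! times the coefficient of z^j."""
    return factorial(j) * coeffs[j] if j < len(coeffs) else 0

h = [96, 0, -20, 0, 1]          # assumed sample: z^4 - 20 z^2 + 96
n = 4                           # assumed sample value; n + 1 = 5 is prime
f = f_coefficients(h, n)

F0 = sum(derivative_at_zero(f, j) for j in range(len(f)))
main_term = h[0] ** (n + 1) * factorial(n)
print(derivative_at_zero(f, n) == main_term)
print((F0 - main_term) % factorial(n + 1) == 0)
```

With these sample values $n + 1 = 5$ does not divide $h(0) = 96$, so $F(0)$ is a multiple of $n!$ but not of $(n+1)!$; this is the kind of divisibility information that the proof exploits.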


We need to evaluate the sum involving terms $f^{(j)}(\beta)$. If $\beta$ is any element of
the union of the $E_k$, then $(z - \beta)^{n+1}$ is a factor of $f(z)$ and so $f^{(j)}(\beta) = 0$ for
$j \le n$. If $j \ge n + 1$, then $f^{(j)}(z)$ is a polynomial whose coefficients are integral
multiples of $(n + 1)!$, and any expression

$$f^{(j)}(z_1) + f^{(j)}(z_2) + \cdots + f^{(j)}(z_s)$$

is a symmetric polynomial in its variables, whose coefficients again are multiples
of $(n + 1)!$. So for each $k$, the sum

$$\sum_{\beta \in E_k} f^{(j)}(\beta) \tag{5.44}$$

is a polynomial with coefficients multiples of $(n + 1)!$, evaluated at the elementary
symmetric functions of the algebraic integers $\beta \in E_k$. Since $E_k$ is
a complete set of conjugates consisting of algebraic integers, each elementary
symmetric polynomial is a rational integer, and the sum (5.44) is a multiple
of $(n + 1)!$. Substituting all these details back into (5.43) yields


$$J = -a_0 |E_0| h(0)^{n+1} n! + \{\text{a multiple of } (n+1)!\} . \tag{5.45}$$

Now we estimate the size of $J$. The definition of $J$ is a sum of
$$T = |E_1| + |E_2| + \cdots + |E_t|$$

integrals; $T$ here is a fixed finite number, not depending on $n$. The sum also
involves coefficients $a_1, a_2, \ldots, a_t$; let $c_0$ be the greatest absolute value of these
coefficients³. Let $R$ be a real number greater than the absolute value of each
of the $\beta$. The polynomial $h(z)$ is analytic everywhere and therefore bounded
on $|z| \le R$, say $|h(z)| \le c_1$. The finitely many functions $e^{\beta - z}$ are also analytic
everywhere and bounded on $|z| \le R$; let $c_2$ be a common bound for all these
functions. Therefore
$$|I_\beta| \le R^{n+1} c_1^{n+1} c_2$$

and
$$|J| \le T c_0 R^{n+1} c_1^{n+1} c_2 = c_3 c_4^n . \tag{5.46}$$

Here $c_3 = T c_0 c_1 c_2 R$ and $c_4 = c_1 R$ do not depend on $n$.
Now choose $n$ such that $n! > c_3 c_4^n$, and such that $n + 1$ is a prime greater
than $|a_0|$ and $|E_0|$ and $|h(0)|$. Then (5.45) shows that $|J|$ is an integer which
is not a multiple of $(n + 1)!$ and is therefore not zero; but which is a multiple
of $n!$ and therefore not less than $n!$; this contradicts (5.46) and completes the
proof. 
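
That a suitable $n$ always exists is clear, since factorials eventually outgrow any geometric sequence and there are infinitely many primes; the following Python sketch simply searches for the smallest one. It is an illustration with made-up constants: the values passed for $c_3$, $c_4$ and the lower bound on the prime are hypothetical, and the function names are our own.

```python
from math import factorial

def is_prime(k):
    """Trial division, adequate for the small numbers used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def choose_n(c3, c4, lower_bound):
    """Smallest n with n + 1 prime, n + 1 > lower_bound and n! > c3 * c4**n."""
    n = 1
    while True:
        if n + 1 > lower_bound and is_prime(n + 1) and factorial(n) > c3 * c4 ** n:
            return n
        n += 1

# Hypothetical constants, chosen only to exercise the search.
c3, c4, lower_bound = 1000, 50, 96
n = choose_n(c3, c4, lower_bound)
print(n, n + 1)
```

For such an $n$ the lower bound $|J| \ge n!$ and the upper bound $|J| \le c_3 c_4^n$ cannot both hold.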


5.5 OTHER RESULTS


The result of the previous section can be rephrased... 


Theorem. Let $\alpha_1$ and $\alpha_2$ be unequal algebraic numbers, and let $\beta_1$ and $\beta_2$ be
algebraic numbers. If $\beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} = 0$, then $\beta_1 = \beta_2 = 0$.


...and rephrased again...


Theorem. Let $\alpha_1$ and $\alpha_2$ be unequal algebraic numbers. Then $e^{\alpha_1}$ and $e^{\alpha_2}$
are linearly independent over the field of algebraic numbers.


...and generalised, giving the extension of Lindemann’s result proved by 
Weierstrass [67] in 1885. 


Theorem 5.17. The Lindemann–Weierstrass Theorem. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be
unequal algebraic numbers. If $\beta_1, \beta_2, \ldots, \beta_n$ are algebraic numbers and

$$\beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} + \cdots + \beta_n e^{\alpha_n} = 0 , \tag{5.47}$$
then $\beta_1 = \beta_2 = \cdots = \beta_n = 0$. That is, $e^{\alpha_1}, e^{\alpha_2}, \ldots, e^{\alpha_n}$ are linearly independent
over the field of algebraic numbers.


³This and subsequent constants need not, naturally, be the coefficients $c_k$ of $p(z)$. But
we have (almost if not quite) run out of letters!

Although the details are yet more complicated, the ideas behind the proof
of this theorem are no different from those we have already seen. We take
(5.47), multiply by similar expressions involving all possible conjugates of the
$\alpha_k$ and $\beta_k$, and show that we obtain a sum like (5.37). Once again it may
be necessary to multiply by an expression like (5.38) in order to ensure that
our sum includes a non-zero integer term. We then define a combination of
integrals similar to J in (5.42), and obtain contradictory estimates for its 
divisibility properties and its size. 


We conclude this chapter by mentioning two other results in a similar 
spirit to those listed above. We have already considered results concerning 
the transcendence of the exponential $e^{\alpha}$ when $\alpha$ is algebraic; now we let $\alpha$
and $\beta$ be algebraic and ask whether $\alpha^{\beta}$ is algebraic or transcendental. This
was the seventh on the list of twenty-three problems proposed by David Hilbert
in 1900 as a challenge to twentieth-century mathematicians.


It is obvious that $\alpha^{\beta}$ is rational if $\alpha$ is 0 or 1; and if $\beta$ is rational (and $\alpha$ algebraic),
then $\alpha^{\beta}$ is also algebraic. So we may disregard these cases and consider
only $\alpha \ne 0, 1$ and $\beta \notin \mathbb{Q}$. Hilbert, apparently, expected the seventh problem to
be one of the most difficult on his list; nevertheless, it was among the first to
be solved. Kuzmin, in 1930, showed that $\alpha^{\beta}$ is transcendental in the case that
$\beta$ is a quadratic irrational; in 1934, A.O. Gelfond and T. Schneider, working
independently of each other, completely settled the problem by proving the 
following result, and it has become customary to give them equal credit for 
its discovery. 


Theorem 5.18. The Gelfond–Schneider Theorem. Let $\alpha$ be an algebraic number,
not 0 or 1, and $\beta$ an algebraic irrational. Then $\alpha^{\beta}$ is transcendental.


Note that $\beta$ may be complex, in which case the power $\alpha^{\beta}$ assumes many
values; the theorem asserts that all of these values are transcendental. The
Gelfond–Schneider Theorem has many interesting and simple corollaries.


Corollary 5.19. The real number $e^{\pi}$ is transcendental.
Proof. $e^{\pi}$ is one of the values of $i^{-2i}$.


Corollary 5.20. Transcendence of logarithms. If $\alpha$ and $\gamma$ are algebraic numbers
with $\alpha \ne 0, 1$ and $\gamma \ne 0$, then $\log\gamma / \log\alpha$ is either rational or transcendental.


Proof. If $\beta = \log\gamma / \log\alpha$, then $\alpha^{\beta} = \gamma$, and so $\beta$ cannot be an algebraic
irrational.


In particular, $\log 2/\log 3$, which was of great importance in the musical questions
of section 4.4.2, is a transcendental number.

In 1966 the British mathematician Alan Baker proved a series of results
which generalise the Gelfond–Schneider Theorem. Baker’s main results are
generally expressed in terms of the logarithms of algebraic numbers, but in 
the interests of consistency with the remainder of this chapter we cite two 
corollaries which can be conveniently written in terms of exponentials. Proofs 
may be found in [9], Chapter 2. 


Theorem 5.21. If $\alpha_1, \ldots, \alpha_n, \beta_0, \beta_1, \ldots, \beta_n$ are non-zero algebraic numbers,
then

$$e^{\beta_0} \alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$$

is transcendental.


Theorem 5.22. If $\alpha_1, \ldots, \alpha_n$ are algebraic numbers, not 0 or 1, and if
$\beta_1, \ldots, \beta_n$ are algebraic numbers such that $\beta_1, \ldots, \beta_n$ and 1 are linearly
independent over the field of rational numbers, then

$$\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$$
is transcendental.

Alan Baker was awarded the Fields Medal in 1970.

EXERCISES 
5.1 Prove that if a rectangle is inscribed in a circle, then the ratio $\alpha$ of the
areas of circle and rectangle, and the ratio $\beta$ of the perimeters of circle
and rectangle, cannot both be algebraic. (From [50].)

5.2 Show that the polynomial

$$(x_1^2 + x_2^2 - x_3^2 - x_4^2)^2 - 4(x_1 x_2 + x_3 x_4)^2$$

is symmetric, and write it as a polynomial in the elementary symmetric
polynomials.

5.3 Prove that if $a_1, \ldots, a_n$ are positive constants, then

n


eee
1l+a,z (1+a,)---(14+ an)


3


k=1
where $e_k$ is the $k$th elementary symmetric polynomial in $a_1, \ldots, a_n$.

5.4 State and prove a generalisation of exercise 3.16 involving symmetric
polynomials.

5.5 Complete the proof of Theorem 5.10 by filling in the following details.
(a) Confirm the equalities (5.16), (5.17), (5.19) and (5.21).
(b) Check that none of the $\beta_k$ is zero.
(c) Explain why if $j < n + 1$, then $f^{(j)}(\beta_k) = 0$.
(d) Show how estimates for the integrals $I_{\beta_k}$ lead to the bound (5.22).
(e) Explain how $n$ should be chosen in order to obtain a contradiction.

5.6 For $\alpha = \sqrt{2} + \sqrt[3]{3}$, compile a (partial) table of complete collections of
conjugates as in the example on page 130, in the case $m = 1$. The
minimal polynomial for $\alpha$ was found in exercise 3.1.

(a) What are the conjugates of $\alpha$?

(b) Find all 15 linear combinations of the $\alpha_k$ corresponding to vectors
$x$ which are permutations of (1,1,0,0,0,0), and partition these
linear combinations into complete sets of conjugates. Some of the
factorisation and irreducibility testing is rather involved: you may
wish to abandon the methods of section 3.1.1 and use computer
assistance instead.

(c) Do the same for the 20 permutations of (1,1,1,0,0,0).

5.7 Prove that if $e^{\alpha}$ is transcendental whenever $\alpha$ is a non-zero algebraic
integer, then $e^{\alpha}$ is transcendental whenever $\alpha$ is a non-zero algebraic
number. (This justifies the “without loss of generality” at the beginning
of the proof of Theorem 5.12.)

5.8 Let $\alpha$ be the real cube root of 2, and let $\alpha_1, \alpha_2, \alpha_3$ be its conjugates; let
$p(z) = 1 - z$. Show that if we expand

$$p(e^{\alpha_1})\, p(e^{\alpha_2})\, p(e^{\alpha_3})$$

and collect terms having the same exponent, then the integer term is
zero.

5.9 Prove that the following are equivalent:
(1) if $\alpha$ is a non-zero algebraic number, then $e^{\alpha}$ is transcendental;

(2) if $\alpha_1$ and $\alpha_2$ are unequal algebraic numbers, then $e^{\alpha_1}$ and $e^{\alpha_2}$ are
linearly independent over the field of algebraic numbers.

5.10 Prove that if $\alpha$ is a non-zero algebraic number, then $\cos\alpha$ and $\sin\alpha$ and
$\tan\alpha$ are transcendental.

5.11 Show that there is a unique real $c$ in the interval $(0, 1)$ such that

$$\int_0^1 c^x \, dx = \sum_{n=1}^{\infty} c^n .$$

Prove that both $c$ and $\log c$ are transcendental.

5.12 The equation x” = 1+ 2 has a unique real positive solution $\alpha$. Show
that $\alpha$ is transcendental.

5.13 Use the Gelfond–Schneider Theorem to prove that if $\alpha$ is a positive
algebraic number with $\alpha \ne 1$, then $\cos(\log\alpha)$ and $\sin(\log\alpha)$ are transcendental.

5.14 In Hilbert’s statement of his seventh problem, before getting to what is
now known as the Gelfond–Schneider Theorem, he gave two versions of
a related question.

(1) In an isosceles triangle, if the ratio of the base angle to the vertex
angle is algebraic and irrational, then the ratio of the base length
to the side length is transcendental.

(2) If $\alpha$ is an algebraic irrational number, then $e^{i\pi\alpha}$ is transcendental.

Prove that these two statements are equivalent, and are implied by the
Gelfond–Schneider Theorem.

A transcript of Hilbert’s address may be found at [32]; there is an English 
translation at [33]. 


APPENDIX 1: ROOTS AND COEFFICIENTS OF POLYNOMIALS 


Let f be a monic polynomial of degree n given by 


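$$f(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 ,$$

and let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be the roots of $f$, repeated according to multiplicity.
Then

$$\alpha_1 + \alpha_2 + \cdots + \alpha_n = -a_{n-1} \quad\text{and}\quad \alpha_1 \alpha_2 \cdots \alpha_n = (-1)^n a_0$$
“and so on”.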
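
The phrase “and so on” refers to the general relation $a_{n-k} = (-1)^k e_k(\alpha_1, \ldots, \alpha_n)$, where $e_k$ is the $k$th elementary symmetric polynomial. As a small check, the Python sketch below (our own illustration, with a sample polynomial chosen by us) verifies this relation for $f(z) = (z-1)(z-2)(z-3)^2 = z^4 - 9z^3 + 29z^2 - 39z + 18$, whose roots are 1, 2, 3, 3.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """Sum of the products of the values taken k at a time."""
    return sum(prod(c) for c in combinations(values, k))

coeffs = [1, -9, 29, -39, 18]     # z^4 - 9z^3 + 29z^2 - 39z + 18, highest degree first
roots = [1, 2, 3, 3]              # roots repeated according to multiplicity
n = len(roots)

# coeffs[k] is the coefficient a_{n-k}; compare it with (-1)^k e_k(roots).
checks = [coeffs[k] == (-1) ** k * elementary_symmetric(roots, k)
          for k in range(1, n + 1)]
print(checks)
```

The cases $k = 1$ and $k = n$ are the two relations displayed above.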


APPENDIX 2: SOME REAL AND COMPLEX ANALYSIS 


Complex integration. The complex contour integral 


$$\int_C f(z)\, dz$$

can be defined as a limit of Riemann sums. If the path of integration $C$ is
parametrised as

$$z = \phi(t) + i\psi(t) \quad\text{for } a \le t \le b ,$$

and if we write

$$f(z) = f(x + iy) = u(x, y) + iv(x, y) ,$$
then
$$\int_C f(z)\, dz = \int_a^b \bigl( u(\phi, \psi)\phi' - v(\phi, \psi)\psi' \bigr)\, dt + i \int_a^b \bigl( u(\phi, \psi)\psi' + v(\phi, \psi)\phi' \bigr)\, dt$$
expresses the integral in terms of real integrals. A complex function f is said 
to be analytic at a certain point if it is differentiable throughout some neigh- 
bourhood of that point, and entire if it is analytic at every point in the 


complex plane. If f is an entire function (this condition can be relaxed) and 
if F is another entire function such that F’ = f, then 


$$\int_{\alpha}^{\beta} f(z)\, dz = F(\beta) - F(\alpha)$$

is independent of the particular contour from $\alpha$ to $\beta$. That is, provided that
a complex function is entire, it can be integrated by the same methods as are 
customarily used for the integration of real functions. 


Continuous real and complex functions. Let D be a closed bounded 
subset of C, and let f be a continuous complex function whose domain includes 
D. Then f is bounded on D: that is, there exists a constant M such that 


$$|f(z)| \le M$$

for all $z$ in $D$. Moreover, though we don’t need the fact for the purposes of
this chapter, $|f|$ actually achieves its maximum: there is a point $z_M$ in $D$ such
that

$$|f(z)| \le |f(z_M)|$$
for all z in D. 


The above properties are also true for real functions: we can regard these 
as special cases of complex functions, and R as a subset of C. In this case D 
will frequently (though not necessarily) be a closed bounded interval [a,b] on 
the real line. 


Complex logarithms and powers. The logarithm of any non-zero complex 
number z is defined by 


$$\log z = \ln|z| + i\arg(z) ,$$

where ln denotes the real natural logarithm function; since a given $z$ has many
arguments, its logarithm will have many values. If $z$ is a non-zero complex
number and $\beta$ is any complex number, we define the (multi-valued) power

$$z^{\beta} = \exp(\beta \log z) .$$
For example,

$$\log i = \ln|i| + i\arg(i) = i\bigl(\tfrac{\pi}{2} + 2k\pi\bigr) , \quad k \in \mathbb{Z} ;$$
therefore
$$i^{-2i} = \exp\bigl((-2i)\, i\, (\tfrac{\pi}{2} + 2k\pi)\bigr) = \exp(\pi + 4k\pi) , \quad k \in \mathbb{Z} ;$$

and one of the values of this expression is $e^{\pi}$.
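
The multi-valued power is easily explored in Python. In the sketch below (an illustration; the helper name `power_values` is our own), `cmath.log` supplies the principal value of the logarithm and the other values are obtained by adding integer multiples of $2\pi i$. The built-in complex power `1j ** (-2j)` uses the principal value only.

```python
import cmath
import math

def power_values(z, beta, ks):
    """Values exp(beta * (Log z + 2*pi*i*k)) of the multi-valued power z**beta."""
    principal = cmath.log(z)
    return [cmath.exp(beta * (principal + 2j * math.pi * k)) for k in ks]

values = power_values(1j, -2j, range(-1, 2))      # k = -1, 0, 1
for k, v in zip(range(-1, 2), values):
    print(k, v.real, math.exp(math.pi + 4 * k * math.pi))

builtin = 1j ** (-2j)                             # principal value
print(builtin.real, math.exp(math.pi))
```

The value for $k = 0$ agrees with $e^{\pi}$ up to rounding, and the other values agree with $\exp(\pi + 4k\pi)$.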


APPENDIX 3: ORDERING COMPLEX NUMBERS 


Suppose that we have a set S on which an order is defined. We can use this 
order to define a related order on $S^n$, the set of $n$-tuples of elements from
$S$. We say that $(s_1, \ldots, s_n) \prec (t_1, \ldots, t_n)$, sometimes read as “$(s_1, \ldots, s_n)$
precedes $(t_1, \ldots, t_n)$”, if and only if there exists an index $k$ such that

$$s_j = t_j \text{ whenever } j < k , \quad\text{and}\quad s_k < t_k .$$

That is, for the first $k$ at which the $n$-tuples $s$ and $t$ differ, the element of $s$
is the smaller.

This is known as the (strict) lexicographic order on $S^n$, and a little
thought shows that it is really just traditional alphabetical order. To decide
which of two words comes first in the dictionary, we compare their first letters;
if they are the same we compare the second letters; and so on. We define the
lexicographic order on the complex numbers by regarding them as ordered
pairs of real numbers: specifically, $z_1 \prec z_2$ if and only if

$$\operatorname{Re}(z_1) < \operatorname{Re}(z_2) \quad\text{or}\quad \operatorname{Re}(z_1) = \operatorname{Re}(z_2) \text{ and } \operatorname{Im}(z_1) < \operatorname{Im}(z_2) .$$

We also write $z_1 \preceq z_2$ to mean that either $z_1 \prec z_2$ or $z_1 = z_2$.


Lemma 5.23. Properties of lexicographic order. The lexicographic order on
C has the following properties.

1. Trichotomy: for any $z_1, z_2 \in \mathbb{C}$, exactly one of the following is true:
   $z_1 \prec z_2$ or $z_1 = z_2$ or $z_2 \prec z_1$.

2. The order is transitive: for any $z_1, z_2, z_3 \in \mathbb{C}$, if $z_1 \prec z_2$ and $z_2 \prec z_3$,
   then $z_1 \prec z_3$.

3. Any non-empty finite subset A of C has a smallest element, that is, an
   element a such that $a \prec z$ for all $z \in A$ other than z = a.

4. The order is compatible with addition: for all $z_1, z_2, w \in \mathbb{C}$, if $z_1 \prec z_2$
   then $z_1 + w \prec z_2 + w$.

5. If $z_1 \preceq z_2$ and $w_1 \preceq w_2$ and at least one of the inequalities is strict, then
   $z_1 + w_1 \prec z_2 + w_2$.


The proof is left as an exercise. 
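Before attempting the proof, the reader may find it reassuring to test some of these properties on a small sample. The sketch below is our own illustration (the function names and the 5 × 5 grid of sample points are assumptions, not from the text); it relies on the fact that Python compares tuples lexicographically. A finite check of this kind is of course not a proof.

```python
from itertools import product

def precedes(z1, z2):
    """Strict lexicographic order on C: real parts first, then imaginary parts."""
    return (z1.real, z1.imag) < (z2.real, z2.imag)

# A small grid of Gaussian integers used as test points (an arbitrary choice).
sample = [complex(x, y) for x, y in product(range(-2, 3), repeat=2)]

def trichotomy_holds(points):
    """Property 1: exactly one of a < b, a = b, b < a holds for every pair."""
    return all((precedes(a, b), a == b, precedes(b, a)).count(True) == 1
               for a in points for b in points)

def addition_compatible(points):
    """Property 4: a < b implies a + w < b + w."""
    return all(precedes(a + w, b + w)
               for a in points for b in points if precedes(a, b)
               for w in points)

def smallest(points):
    """Property 3: the smallest element of a non-empty finite set."""
    return min(points, key=lambda z: (z.real, z.imag))
```

On this sample, `smallest(sample)` is the point with the least real part and, among those, the least imaginary part.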
Comment. Where does this definition leave the well known idea that “it is
impossible to order complex numbers”? Well, what is actually meant by this
statement is not that it is impossible to order C at all, but that it is impossible
to order C in a way that is “satisfactory” for most algebraic purposes. Specifically,
we want an order $\prec$ which has the properties of trichotomy, transitivity
and compatibility with addition already mentioned and is also compatible
with multiplication: that is, for all $z_1, z_2, w \in \mathbb{C}$,

\[ \text{if}\ z_1 \prec z_2 \ \text{and}\ w \succ 0 , \ \text{then}\ z_1 w \prec z_2 w . \]


Exercise. Show by means of an example that the lexicographic order on C is 
not compatible with multiplication. This means that the lexicographic order 
does not conflict with the “impossibility of ordering complex numbers”. In fact, 
it can be proved that there is no way of ordering C which is “satisfactory” in 
the sense of the previous comment. Further details may be found in [4]. 


CHAPTER 6 


Automata and 
Transcendence 


A star dies in an exponential arc 
And we dribble toward mystery, 
Leaving a trail of random decimals. 


Tom Petsinis, “A Transcendental Meditation”


IN VIEW OF LIOUVILLE’S THEOREM concerning the poor approximability
properties of algebraic numbers, we know that a real number must be tran- 
scendental if it is “too well” approximable by rationals. In the examples 


\[ \alpha = \sum_{k=0}^{\infty} 10^{-2^k} \quad\text{and}\quad \lambda = \sum_{k=1}^{\infty} 10^{-k!} \]


on pages 43 and 50 we found (or attempted to find) good approximations by 
looking at patterns in the decimal expansions of these numbers and truncating 
the expansions at suitable points. Similarly, in Chapter 4 we obtained good 
rational approximations to various numbers ($\pi$, the number of days in a year,
$\log 2/\log 3$) by truncating their continued fractions in an appropriate manner.


These examples suggest that if a number can be expressed in some way 
(a decimal or a continued fraction; perhaps also an infinite series or product) 
where the digits or coefficients form some kind of pattern, rather than just 
a random sequence, then the number “ought to be” transcendental. For by 
taking advantage of the pattern we may hope to truncate the expression at 
points which will yield exceptionally good rational approximations; and we 
know that a number having such approximations must be transcendental. 


It is clear, however, that this argument cannot be taken too literally. For 
a start, it fails for the simplest possible kind of pattern, a periodic sequence, 
since a decimal with periodic digits is rational and a continued fraction with 
periodic partial quotients is algebraic (of degree 2). Apart from this, the ar- 
gument is far too vague to be more than a general guide: many points are 

Figure 6.1 A deterministic finite automaton.


left obscure, most importantly the question of what exactly is meant by “a 
pattern”. Defining this term, or, to put it otherwise, distinguishing between 
“order” and “chaos”, is a problem which involves (at least!) mathematics, 
philosophy and psychology, and which perhaps has no sharply defined answer 
anyway. To make progress we shall not investigate patterns in general but 
shall concentrate on producing sequences of numbers by methods sufficiently 
simple that we may reasonably regard the results as being patterned, though 
not exhaustive of all possible patterns. We shall then show that, subject to 
certain technical conditions, these sequences define real numbers which can 
be proved transcendental. 


6.1 DETERMINISTIC FINITE AUTOMATA


A deterministic finite automaton is, roughly speaking, a machine for sorting 
strings of letters into classes. We can identify a non-negative integer with 
its string of digits in some base (with the conventions that 0 is represented 
by the empty string, and that no string begins with zero), and can therefore 
regard a DFA as a machine for classifying numbers. A simple example of a 
DFA is shown in figure 6.1. The circles and double circles are called the states 
of the DFA, the numbers inside them being merely labels by which we refer 
to these states. The arrows between states, marked with numbers, are the 
transitions of the DFA, and the arrow “coming from nowhere” indicates the 
initial state. The transition marked with two numbers and attached to state 
3 is really two transitions which have been combined to simplify the diagram. 
The state marked with a double circle is an accepting state while those with 
single circles are non-accepting or rejecting states. As far as applications to
transcendence are concerned, these ideas should suffice; for readers who care 
to pursue the topic further, a more formal definition of a DFA is given in the 
first appendix to this chapter. 


Let r be a positive integer and consider a deterministic finite automaton M
over the alphabet $\{0, 1, \ldots, r-1\}$, that is, a DFA whose arrows are marked
with these r digits. To see how M “classifies” a non-negative integer k we 
write k in base r, start in the initial state and follow the arrows labelled with 
the digits of k, read from left to right. If after “processing” the last digit we 
have arrived at an accepting state we say that M accepts k, if not we say that 
M rejects k. The sole function of the DFA (though we may slightly modify 
this later) is to partition the set of natural numbers into two classes, those
accepted and those rejected by M. We may then define the characteristic
sequence of M with terms

\[ a_k = \begin{cases} 1 & \text{if $M$ accepts $k$} \\ 0 & \text{if not,} \end{cases} \]

and may write down a “decimal” $a_0.a_1a_2\cdots$ in any base $b \ge 2$. That is, we
define

\[ \alpha_M = a_0.a_1a_2\cdots = \sum_{k=0}^{\infty} \frac{a_k}{b^k} = \sum_{M \text{ accepts } k} \frac{1}{b^k} . \]

More generally, we regard $\alpha_M$ as being the value at $1/b$ of the function

\[ f(z) = \sum_{k=0}^{\infty} a_k z^k = \sum_{M \text{ accepts } k} z^k . \]

This is the generating function of the sequence $\{a_k\}$, and we shall sometimes
also (perhaps imprecisely) refer to it as the generating function of M.


In the example depicted above, we have r = 2 and so we classify numbers 
according to the properties of their binary representation. By convention the 
first digit of a number is not zero, although this particular automaton has been 
designed in such a way that the final outcome is no different if it is. It is not 
difficult to see that in this case a string of zeros and ones leads to and remains 
in the accepting state, state 2, if and only if it consists of a one followed by a 
string (possibly empty) of zeros. Therefore, the numbers accepted by the DFA 
are precisely the powers of 2. The automaton has the generating function 


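\[ f(z) = \sum_{k \text{ is a power of } 2} z^k = \sum_{m=0}^{\infty} z^{2^m} , \tag{6.1} \]

and taking, for example, b = 10, we may consider the real number

\[ \alpha = f\bigl(\tfrac{1}{10}\bigr) = \sum_{m=0}^{\infty} 10^{-2^m} = 0.11010001000000010000\cdots . \]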


This will be recognised as a number that we have considered as far back as
Chapter 1. We noted there that it is easy to see that the decimal expansion
of $\alpha$ is not periodic and so $\alpha$ is irrational; in Chapter 4 (exercise 4.10) we
showed that $\alpha$ is not approximable to order greater than 2, and therefore is
not a Liouville number. We have also asserted that $\alpha$ is transcendental, and
we shall soon be able to prove this.
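The way a DFA classifies integers is easily simulated. The sketch below is our own illustration: the transition table is an assumption, reconstructed from the verbal description of the automaton (state 1 initial, state 2 accepting, state 3 a trap with a combined 0/1 loop), since the figure itself is not reproduced here, and all function names are hypothetical.

```python
# Transition table reconstructed from the description in the text (an assumption).
TRANSITIONS = {
    (1, "0"): 1, (1, "1"): 2,   # leading zeros are harmless; the first 1 moves to state 2
    (2, "0"): 2, (2, "1"): 3,   # trailing zeros keep us in the accepting state
    (3, "0"): 3, (3, "1"): 3,   # a second 1 leads to the rejecting trap state
}
INITIAL, ACCEPTING = 1, {2}

def to_base(k, r):
    """Digits of k in base r, most significant first; 0 is the empty string."""
    digits = ""
    while k > 0:
        digits = str(k % r) + digits
        k //= r
    return digits

def accepts(k, r=2):
    """Follow the arrows labelled by the digits of k, read from left to right."""
    state = INITIAL
    for digit in to_base(k, r):
        state = TRANSITIONS[(state, digit)]
    return state in ACCEPTING

def characteristic_sequence(n):
    """The terms a_0, a_1, ..., a_(n-1) of the characteristic sequence."""
    return [1 if accepts(k) else 0 for k in range(n)]

a = characteristic_sequence(21)
decimal = f"{a[0]}." + "".join(str(d) for d in a[1:])
```

With this table the accepted numbers below 100 are 1, 2, 4, 8, 16, 32 and 64, and `decimal` reproduces the first twenty decimal places displayed above.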


In his autobiographical note Fifty Years as a Mathematician [43], Kurt 
Mahler describes how he began to study transcendence theory in 1926. 
During a part of that year I was very ill and in bed. To occupy myself,
I played with the function [given above, equation (6.1)] and tried to
prove that $f(\zeta)$ is irrational for rational $\zeta$ satisfying $0 < |\zeta| < 1$. I
succeeded and ended by proving that $f(\zeta)$ is transcendental for all
algebraic numbers $\zeta$ satisfying this inequality.


We shall study his solution of this particular problem before seeking to gen- 
eralise the method of proof to further examples. 


Comment. Deterministic finite automata are actually extremely weak, and 
can only recognise the simplest kinds of patterns. (In fact, a DFA is the weakest 
type of automaton normally regarded as “interesting”.) For example, it is 
possible to prove that there is no DFA which will accept the set of squares 
{0,1,4,9,16,...} and reject the non-squares; so the number 


\[ \alpha = \sum_{k=0}^{\infty} 10^{-k^2} = 1.10010000100000010000000010000\cdots \]


from page 7 cannot be produced by the method we have discussed. Roughly, 
this is because being a square is a “number-theoretical” property, while DFAs
can only recognise “typographical” properties. For instance, each of the fol- 
lowing sets of natural numbers is accepted by some DFA: 
{ numbers with two consecutive 1s in their base 2 digits } , 
{ numbers for which the last 1 is followed by a 2 in base 3} , 
{ numbers with an odd number of digits in base 4} . 
A set which is defined by a number-theoretic property but which is accepted
by a DFA is the set of all numbers which cannot be written as the sum of
three squares,


S = {7,15, 23, 28, 31, 39, 47, 55, 60, 63,...}. 
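The elements of S listed above can be confirmed by a direct search. This is only a brute-force sketch of our own (the function name and the search bound 64 are assumptions); it verifies the list and does not address the exercise which follows.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """True if n = x^2 + y^2 + z^2 for some non-negative integers x, y, z."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            rest = n - x * x - y * y
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False

# Numbers below 64 which are not sums of three squares.
exceptions = [n for n in range(64) if not is_sum_of_three_squares(n)]
```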


Exercise. Explain this apparent exception to the “typographical” principle! 


6.2 MAHLER’S TRANSCENDENCE PROOF 


We begin Mahler’s proof by observing some simple facts about the function 
defined in equation (6.1). First, note that f(0) = 0; also, f is defined by a 
Taylor series convergent for |z| < 1 and is therefore analytic in the open unit 
disc. So we take $\zeta$ to be a (complex) algebraic number satisfying $0 < |\zeta| < 1$,
assume that $f(\zeta)$ is also algebraic, and seek to obtain a contradiction.


The connection between automata and transcendence relies on obtaining 
a functional equation for the generating function. In the present case, we have 


\[ f(z) = z + z^2 + z^4 + z^8 + \cdots , \qquad f(z^2) = z^2 + z^4 + z^8 + z^{16} + \cdots , \]

and therefore

\[ f(z) = f(z^2) + z ; \tag{6.2} \]


we shall show that the characteristic sequence of any DFA will lead to either 
a functional equation of this type, or a system of several such equations in 
several functions. 


The next step is to iterate the functional equation (6.2) — that is, repeatedly 
substitute the left-hand side into its own right-hand side — to obtain 


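\[ f(z) = f(z^4) + z^2 + z = f(z^8) + z^4 + z^2 + z \]

and in general

\[ f(z) = f(z^{2^t}) + z^{2^{t-1}} + \cdots + z^4 + z^2 + z \]

for any t > 0. The point of this is that then

\[ f(\zeta^{2^t}) = f(\zeta) - \zeta - \zeta^2 - \zeta^4 - \cdots - \zeta^{2^{t-1}} \tag{6.3} \]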


is an algebraic number which is in a sense no more complicated than $\zeta$ and
$f(\zeta)$ themselves, but which approaches zero very rapidly as t tends towards
infinity. We shall show, however, that a non-zero algebraic number of fixed 
degree and small denominator cannot be too small in absolute value. This 
is analogous to the fact that a non-zero rational with denominator q cannot 
have absolute value less than 1/q, which we have used in proving various 
approximation results (see, for example, Lemma 3.18). The inconsistency in 
the two estimates for the size of our algebraic number will produce the desired 
contradiction. 
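The functional equation and its iterated form are easy to test numerically. The following sketch is our own (the truncation at 60 terms and the sample point z = 0.5 + 0.3i are arbitrary assumptions); it evaluates a partial sum of the series and compares both sides of the two identities.

```python
def f(z, terms=60):
    """Partial sum of f(z) = z + z^2 + z^4 + z^8 + ..., adequate for |z| < 1."""
    total, power = 0.0, z
    for _ in range(terms):
        total += power
        power = power * power        # z^(2^m) becomes z^(2^(m+1))
    return total

def iterated(z, t):
    """Right-hand side of (6.3): f(z) - z - z^2 - z^4 - ... - z^(2^(t-1))."""
    return f(z) - sum(z ** (2 ** m) for m in range(t))

z = 0.5 + 0.3j
# Absolute values of f(z^(2^t)) for t = 0, ..., 5, showing the rapid decay.
decay = [abs(f(z ** (2 ** t))) for t in range(6)]
```

The list `decay` illustrates how quickly the quantity in (6.3) shrinks: each step roughly squares the previous absolute value.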


In fact, despite the iterated exponential, the expression $f(\zeta^{2^t})$ still does
not approach zero fast enough for our purposes, and we must consider the
so-called auxiliary function

\[ E(z) = \sum_{j=0}^{s} a_j(z) f(z)^j . \tag{6.4} \]

Here s is a positive integral parameter which, as in proofs by Hermite’s
method, we shall specify later, choosing it large enough to obtain a
contradiction. The expressions $a_j(z)$ are polynomials, not all zero, having integral
coefficients and degree at most s; we choose these polynomials in such a way
that E(z) has a power series

\[ E(z) = \sum_{k=s^2}^{\infty} e_k z^k \tag{6.5} \]


in which the first $s^2$ coefficients vanish. To see that such a choice for the
polynomials is possible, write out the s + 1 polynomials $a_j(z)$ in terms of
their $(s+1)^2$ unknown coefficients, then expand the sum (6.4) to obtain the
first $s^2$ terms of the power series (6.5); equate the coefficients of all these
terms to zero. Then to find the coefficients of the polynomials, we have to
solve a homogeneous system of $s^2$ linear equations in $(s+1)^2$ unknowns, with
rational coefficients; but, as is shown in appendix 3, such a system always has
a non-zero rational solution. Multiplying by a common denominator gives a
non-zero integral solution, as claimed. To clarify this argument we carry out
the details with s = 2. Set


\begin{align*}
E(z) &= (a_{00} + a_{01}z + a_{02}z^2) + (a_{10} + a_{11}z + a_{12}z^2)(z + z^2 + z^4 + \cdots) \\
     &\qquad + (a_{20} + a_{21}z + a_{22}z^2)(z + z^2 + z^4 + \cdots)^2 \\
     &= a_{00} + (a_{01} + a_{10})z + (a_{02} + a_{10} + a_{11} + a_{20})z^2 \\
     &\qquad + (a_{11} + a_{12} + 2a_{20} + a_{21})z^3 + \cdots ;
\end{align*}

we want to find coefficients $a_{jk}$, not all zero, in such a way that the terms of
degree less than 4 vanish. Since it is sufficient that the coefficients satisfy four
homogeneous linear equations in nine unknowns this is certainly possible, and
we can find by trial and error a solution such as

\[ a_{00} = a_{01} = a_{10} = a_{12} = a_{20} = a_{22} = 0 , \quad a_{02} = a_{21} = 1 , \quad a_{11} = -1 . \]

That is, we may choose

\[ E(z) = z^2 - z f(z) + z f(z)^2 ; \]

by substituting the series for f(z) and expanding we can check that the series
for E(z) has no terms of degree less than 4.
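The final check can be carried out mechanically with truncated power series. In this sketch, which is ours and not part of the original argument, series are stored as coefficient lists cut off at degree N = 12 (an arbitrary choice, large enough to see the first few non-zero terms).

```python
N = 12  # keep the coefficients of z^0, ..., z^(N-1) only

def mul(p, q):
    """Product of two truncated power series given as coefficient lists."""
    out = [0] * N
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if a and b and i + j < N:
                out[i + j] += a * b
    return out

f = [1 if k in (1, 2, 4, 8) else 0 for k in range(N)]   # z + z^2 + z^4 + z^8
z = [0, 1] + [0] * (N - 2)
z2 = mul(z, z)
f2 = mul(f, f)

# E(z) = z^2 - z f(z) + z f(z)^2, coefficient by coefficient.
E = [a - b + c for a, b, c in zip(z2, mul(z, f), mul(z, f2))]
```

The first four entries of `E` are zero, and the series begins with the term $2z^4$.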


We must also rule out the possibility that E(z) is identically zero. A function
f is said to be algebraic over the set Q[z] of polynomials with rational
coefficients if there exist polynomials $a_n, a_{n-1}, \ldots, a_1, a_0$ in Q[z] such that $a_n$
is not the zero polynomial and

\[ a_n(z) f(z)^n + a_{n-1}(z) f(z)^{n-1} + \cdots + a_1(z) f(z) + a_0(z) = 0 \tag{6.6} \]

for all z; the function is said to be transcendental over Q[z] if there are
no such polynomials $a_j$. These definitions are analogous to the definitions of
algebraic and transcendental numbers that we introduced in Chapter 3.


It can be shown — see appendix 2.4 — that the function f which we are 
now considering is a transcendental function. Therefore, no equation such as 
(6.6), with coefficients not all zero, can hold for all z, and it follows from (6.4) 
that E(z) is not identically zero. 


As shown in appendix 2.3, we can estimate the size of E(z) by taking,
essentially, its first term alone. Since there exists a complex number $\zeta_0$ with

\[ |\zeta^{2^t}| < \cdots < |\zeta^2| < |\zeta| < |\zeta_0| < 1 , \]

and since the series for f(z), and hence also that for E(z), converges at $z = \zeta_0$,
we may use the lemma on estimation of power series in the appendix to see
that

\[ |E(\zeta^{2^t})| \le c(s)\, |\zeta|^{s^2 2^t} . \tag{6.7} \]

Note that the constant c depends on the function E and therefore on the
parameter s (we have indicated this by writing c(s) and not just c) but does
not depend on the value $z = \zeta^{2^t}$ at which E is evaluated, and therefore does
not depend on t. Observe also that since $|\zeta| < 1$, the right-hand side of (6.7)
is a decreasing exponential and will be very small for large values of t.


Next we define the size of an algebraic number in a way which will serve
to measure the number’s “algebraic complexity”. Recall that for any algebraic
number $\beta$ the denominator, denoted den $\beta$, of $\beta$ is the smallest positive integer
d such that $d\beta$ is an algebraic integer, and that if $\beta$ has degree n, then the
conjugates of $\beta$ are the n complex numbers (including $\beta$ itself) which have the
same minimal polynomial as $\beta$.


Definition 6.1. Let $\beta$ be an algebraic number with denominator d and
conjugates $\beta_1, \beta_2, \ldots, \beta_n$. The algebraic size of $\beta$ is

\[ \|\beta\| = \max(d, |\beta_1|, |\beta_2|, \ldots, |\beta_n|) . \]

Note that since d is a positive integer, $\|\beta\|$ is always at least 1.


Examples. 


• If $\beta = \sqrt3 - \sqrt2$, then $\beta$ is an algebraic integer and so its denominator
  is 1. The conjugates of $\beta$ are $\pm\sqrt3 \pm \sqrt2$, and the largest of these in
  absolute value is $\sqrt3 + \sqrt2$. So $\|\beta\| = \sqrt3 + \sqrt2$.


e Let 8 = cos 47; as we saw in Chapter 3 on pages 33 and 37, the conju- 
7 g 
gates of 8 are cos 47, cos 37, cos 27 and its denominator is 2. Since the 


conjugates of $\beta$ are less than 1 in absolute value, we have $\|\beta\| = 2$.
• For $\beta = 0$ we have $\|\beta\| = 1$.


The application of this measure to transcendence proofs is based upon the 
following result. 


Lemma 6.1. The fundamental inequality for algebraic size. If $\beta$ is a non-zero
algebraic number of degree n, then

\[ |\beta|\, \|\beta\|^{2n} \ge 1 . \]


Comment. Roughly speaking, this inequality says that a non-zero algebraic
number cannot simultaneously have small denominator and small conjugates,
just as (still roughly speaking) a non-zero rational number cannot
simultaneously have small denominator and be small itself.
Proof. Let $\beta$ have denominator d and conjugates $\beta_1, \beta_2, \ldots, \beta_n$. Then $d\beta$ is
an algebraic integer whose conjugates are $d\beta_1, d\beta_2, \ldots, d\beta_n$, and so we have a
polynomial equation of the form

\[ (z - d\beta_1)(z - d\beta_2) \cdots (z - d\beta_n) = z^n + b_{n-1}z^{n-1} + \cdots + b_1 z + b_0 , \]

where the coefficients $b_k$ on the right-hand side are rational integers. By
expanding the left-hand side we obtain

\[ |d\beta_1|\, |d\beta_2| \cdots |d\beta_n| = |b_0| ; \]

since none of the conjugates $d\beta_k$ is zero, $b_0$ cannot be zero and we have

\[ 1 \le |b_0| = d^n |\beta_1|\, |\beta_2| \cdots |\beta_n| \le \|\beta\|^n\, |\beta|\, \|\beta\|^{n-1} \le |\beta|\, \|\beta\|^{2n} , \]

which is the desired result.
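As a sanity check, the quantities in this proof can be computed in floating point for the first example above, $\beta = \sqrt3 - \sqrt2$. The sketch is our own; the variable names are hypothetical, and the values d = 1 and n = 4 are taken from that example.

```python
from itertools import product
from math import sqrt, prod

# beta = sqrt(3) - sqrt(2) is an algebraic integer of degree 4, so d = 1, n = 4.
beta = sqrt(3) - sqrt(2)
d, n = 1, 4
conjugates = [s * sqrt(3) + t * sqrt(2) for s, t in product((1, -1), repeat=2)]

size = max(d, *(abs(c) for c in conjugates))   # algebraic size ||beta||
b0 = prod(d * c for c in conjugates)           # product of the conjugates of d*beta
lhs = abs(beta) * size ** (2 * n)              # left-hand side of Lemma 6.1
```

Here the product of the conjugates is 1 up to rounding, in agreement with $1 \le |b_0|$, and `lhs` is far larger than 1, which shows that for this particular number the inequality is not sharp.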


Algebraic size has the following properties, which will be proved on 
page 157. 


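Lemma 6.2. Let $\beta_1, \beta_2, \ldots, \beta_m$ be algebraic numbers, and suppose that d is
a common denominator for these numbers (that is, $d\beta_j$ is an algebraic integer
for every j). Then

• $\|\beta_1 + \beta_2 + \cdots + \beta_m\| \le \max(d, \|\beta_1\| + \|\beta_2\| + \cdots + \|\beta_m\|)$;
• $\|\beta_1 + \beta_2 + \cdots + \beta_m\| \le m\, \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\|$; and
• $\|\beta_1 \beta_2 \cdots \beta_m\| \le \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\|$.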


Observe that $E(\zeta^{2^t})$ is an algebraic number for any positive integer t. For since
$\zeta$ is algebraic and we have assumed that $f(\zeta)$ is also algebraic, relation (6.3)
shows that $f(\zeta^{2^t})$ is algebraic; hence

\[ E(\zeta^{2^t}) = \sum_{j=0}^{s} a_j(\zeta^{2^t})\, f(\zeta^{2^t})^j \tag{6.8} \]

is an expression consisting of sums and products of algebraic numbers and is
therefore itself algebraic. We can use the above properties to estimate the
algebraic size of $E(\zeta^{2^t})$; provided that $E(\zeta^{2^t})$ is not zero, this estimate, together
with our estimate (6.7) for its absolute value, will contradict the fundamental
inequality for algebraic size and will hence show that $f(\zeta)$ is transcendental.


From (6.8) we see that $E(\zeta^{2^t})$ can be regarded as a polynomial in two
variables $z_1$ and $z_2$, with degree at most s in each, evaluated at $z_1 = \zeta^{2^t}$ and
$z_2 = f(\zeta^{2^t})$. Hence Corollary 6.3 gives the estimate

\[ \|E(\zeta^{2^t})\| \le c'(s)\, \|\zeta^{2^t}\|^s\, \|f(\zeta^{2^t})\|^s . \]




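But equation (6.3) expresses $f(\zeta^{2^t})$ as a sum of t + 1 terms in $\zeta$ and $f(\zeta)$, so
the properties stated in our most recent lemma give

\[ \|f(\zeta^{2^t})\| \le (t+1)\, \|f(\zeta)\|\, \|\zeta\|\, \|\zeta^2\| \cdots \|\zeta^{2^{t-1}}\| \le (t+1)\, \|f(\zeta)\|\, \|\zeta\|^{2^t} \]

and therefore

\[ \|E(\zeta^{2^t})\| \le c'(s)\, (t+1)^s\, \|f(\zeta)\|^s\, \|\zeta\|^{2s\,2^t} . \tag{6.9} \]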


We also need to find out something about the degree of $E(\zeta^{2^t})$. To do this
we use results on algebraic numbers that we obtained in section 3.1.2. Let V
be the vector space over Q spanned by all terms of the form $\zeta^j f(\zeta)^k$. Then
by rewriting the expression (6.8) as

\[ E(\zeta^{2^t}) = \sum_{j=0}^{s} \bigl(a_{j0} + a_{j1}\zeta^{2^t} + \cdots + a_{js}\zeta^{s 2^t}\bigr) \bigl(f(\zeta) - \zeta - \zeta^2 - \cdots - \zeta^{2^{t-1}}\bigr)^j \]

and expanding, we see that $E(\zeta^{2^t})$ is a finite sum of such terms, and hence is an
element of V. But V has a spanning set consisting of finitely many monomials
$\zeta^j f(\zeta)^k$, namely, those for which $0 \le j < \deg\zeta$ and $0 \le k < \deg f(\zeta)$; so V is
finite-dimensional, say dim V = n, and we have $\deg E(\zeta^{2^t}) \le n$. Observe that
n does not depend on s or t.


Our aim is to show that what we know about the degree, absolute value and
algebraic size of $E(\zeta^{2^t})$ contradicts the fundamental inequality, Lemma 6.1.
For this inequality to be valid, $E(\zeta^{2^t})$ must be non-zero: we shall use results
of complex analysis to prove that this is so for all sufficiently large t. Indeed,
if there are infinitely many t with $E(\zeta^{2^t}) = 0$, then E(z) has a sequence of
roots tending to the origin, a point at which E(z) is analytic. By the result in
appendix 2.3, this implies that E(z) is identically zero. But we already know
that this is not so, and therefore $E(\zeta^{2^t})$ is non-zero for all sufficiently large t.

We are now ready to choose s and t in such a way that the estimates (6.7)
and (6.9) for the absolute value and algebraic size of $E(\zeta^{2^t})$ are incompatible.
To simplify (6.7) write $c_1 = |\zeta|^{-1/2}$, noting that $|\zeta| < 1$ and so $c_1 > 1$. Then
increasing powers of $c_1$ tend to infinity; therefore for any fixed s we can choose
t large enough that $c(s) \le c_1^{s^2 2^t}$, and consequently

\[ |E(\zeta^{2^t})| \le c_1^{-s^2 2^t} . \]

We use similar ideas to simplify the estimate (6.9): for any s we can find t
sufficiently large that $c'(s) \le 2^{s 2^t}$, and it is easy to see that $t + 1 \le 2^{2^t}$ for
any $t \ge 0$. Hence

\[ \|E(\zeta^{2^t})\| \le 2^{s 2^t}\, 2^{s 2^t}\, \|f(\zeta)\|^s\, \|\zeta\|^{2s\,2^t} \le c_2^{s 2^t} , \]

where $c_2 = 4\, \|f(\zeta)\|\, \|\zeta\|^2$ is a constant independent of s and t. The fundamental
inequality for algebraic size now shows that

\[ c_1^{-s^2 2^t}\, c_2^{2ns 2^t} \ge 1 , \tag{6.10} \]

provided that $E(\zeta^{2^t}) \ne 0$. Choose an integer $s > 2n \log c_2 / \log c_1$, and then
choose t sufficiently large that the simplifications made in this paragraph are
valid and $E(\zeta^{2^t})$ is non-zero. Then

\[ c_1^{-s^2 2^t}\, c_2^{2ns 2^t} = \bigl(c_2^{2n} / c_1^{s}\bigr)^{s 2^t} < 1 , \]

which contradicts (6.10) and establishes that $f(\zeta)$ is transcendental.

Although this is rather a long proof, it can be broken up into somewhat
less intimidating sections. In summary, we have considered the function

\[ f(z) = \sum_{m=0}^{\infty} z^{2^m} = z + z^2 + z^4 + z^8 + \cdots \]

and an algebraic number $\zeta$ satisfying $0 < |\zeta| < 1$; we wish to show that $f(\zeta)$
is transcendental by assuming the contrary and deriving a contradiction.

• Find a functional equation for f(z); iterate it, giving a formula for
  $f(z^{2^t})$.
• Construct an auxiliary function E(z) whose power series begins with a
  high power of z. The construction involves a parameter s, as yet
  unspecified.
• For any t, estimate the absolute value of $E(\zeta^{2^t})$.
• Estimate the algebraic size of $E(\zeta^{2^t})$.
• Show that the degree of $E(\zeta^{2^t})$ is independent of s and t.
• Show that for a suitable choice of s and t, the estimates we have made
  contradict the fundamental inequality for algebraic size; conclude that
  $f(\zeta)$ cannot be algebraic.


6.3 A MORE GENERAL TRANSCENDENCE RESULT


We seek to generalise the example just investigated. The proof of the gen- 
eralisation will follow exactly the same lines as that of the example, and we 
shall set it out according to the steps listed at the end of the previous section. 
First, we need to prove the properties of algebraic size stated above, as well 
as various other properties which will facilitate a more general argument. 


Lemma 6.2. Properties of algebraic size. Let $\beta_1, \beta_2, \ldots, \beta_m$ be algebraic
numbers, and suppose that d is a common denominator for these numbers, so that
$d\beta_j$ is an algebraic integer for all j. Then

• $\|\beta_1 + \beta_2 + \cdots + \beta_m\| \le \max(d, \|\beta_1\| + \|\beta_2\| + \cdots + \|\beta_m\|)$;
• $\|\beta_1 + \beta_2 + \cdots + \beta_m\| \le m\, \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\|$; and
• $\|\beta_1 \beta_2 \cdots \beta_m\| \le \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\|$.
Proof. For any algebraic numbers $\beta$ and $\gamma$, each conjugate of $\beta + \gamma$ has the
form $\beta_j + \gamma_k$, where $\beta_j$ and $\gamma_k$ are conjugates of $\beta$ and $\gamma$. So if $\delta$ is a conjugate
of $\beta + \gamma$, then

\[ |\delta| = |\beta_j + \gamma_k| \le |\beta_j| + |\gamma_k| \le \|\beta\| + \|\gamma\| . \tag{6.11} \]

If d is a common denominator of $\beta$ and $\gamma$, then $d(\beta + \gamma) = d\beta + d\gamma$ is a sum of
algebraic integers and so is itself an algebraic integer. Thus the denominator
of $\beta + \gamma$ is at most d; this, together with (6.11), shows that

\[ \|\beta + \gamma\| \le \max(d, \|\beta\| + \|\gamma\|) . \]

The inequality is easily extended by induction to sums of m algebraic numbers,
and this proves the first claim. To prove the second, let $d_1, d_2, \ldots, d_m$ be
the denominators of $\beta_1, \beta_2, \ldots, \beta_m$ respectively; then it is easy to see that
$d_1 d_2 \cdots d_m$ is a common denominator for $\beta_1, \beta_2, \ldots, \beta_m$, and the result we
have just proved shows that

\[ \|\beta_1 + \beta_2 + \cdots + \beta_m\| \le \max(d_1 d_2 \cdots d_m, \|\beta_1\| + \|\beta_2\| + \cdots + \|\beta_m\|) . \]

However

\[ d_1 d_2 \cdots d_m \le \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| \le m\, \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| \]

and since $\|\beta_j\|$ is always at least 1, we have

\begin{align*}
\|\beta_1\| + \|\beta_2\| + \cdots + \|\beta_m\| &\le \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| + \cdots + \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| \\
 &= m\, \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| ;
\end{align*}

so

\[ \|\beta_1 + \beta_2 + \cdots + \beta_m\| \le m\, \|\beta_1\|\, \|\beta_2\| \cdots \|\beta_m\| \]

as claimed. The proof of the third result is a slightly simpler application of
the same ideas, and is left as an exercise.


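Corollary 6.3. Algebraic size of a polynomial expression. Let p be a
polynomial in m variables, having degree $n_k$ in the kth variable, with algebraic
coefficients. Then there is a constant c, depending only on p, such that for all
algebraic numbers $\alpha_1, \alpha_2, \ldots, \alpha_m$, we have

\[ \|p(\alpha_1, \alpha_2, \ldots, \alpha_m)\| \le c\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} . \]

Proof. For each k, let $d_k$ be the denominator of $\alpha_k$; let d be a common
denominator for the coefficients of p. Then $D = d\, d_1^{n_1} d_2^{n_2} \cdots d_m^{n_m}$ is a common
denominator for all the terms comprising $p(\alpha_1, \alpha_2, \ldots, \alpha_m)$; clearly

\[ D \le d\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} . \]

Using the third of the above properties, each term in $p(\alpha_1, \alpha_2, \ldots, \alpha_m)$ has
algebraic size

\begin{align*}
\|p_{j_1 j_2 \ldots j_m}\, \alpha_1^{j_1} \alpha_2^{j_2} \cdots \alpha_m^{j_m}\| &\le \|p_{j_1 j_2 \ldots j_m}\|\, \|\alpha_1\|^{j_1} \cdots \|\alpha_m\|^{j_m} \\
 &\le \|p_{j_1 j_2 \ldots j_m}\|\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} ,
\end{align*}

and the sum of the algebraic sizes of all terms is at most

\[ \Bigl( \sum_{j_1, \ldots, j_m} \|p_{j_1 j_2 \ldots j_m}\| \Bigr) \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} = P\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} , \]

say. So by the first property,

\begin{align*}
\|p(\alpha_1, \alpha_2, \ldots, \alpha_m)\| &\le \max\bigl(d\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m},\ P\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m}\bigr) \\
 &= c\, \|\alpha_1\|^{n_1} \cdots \|\alpha_m\|^{n_m} ,
\end{align*}

where c = max(d, P) is a constant which depends only on the polynomial p.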


Next we prove a result concerning products of polynomials evaluated at 
powers of a fixed algebraic number. This will be required in proving Theo- 
rem 6.6, which generalises our earlier example. 


Corollary 6.4. Algebraic size of a polynomial product. Let $p_1, p_2, p_3, \ldots, p_t$
be polynomials with integer coefficients, having degree less than m and height
less than h. Let $\zeta$ be a non-zero algebraic number, and let r be an integer,
$r \ge 2$. Then

\[ \|p_1(\zeta)\, p_2(\zeta^r)\, p_3(\zeta^{r^2}) \cdots p_t(\zeta^{r^{t-1}})\| \le m^t h^t \|\zeta\|^{m r^t} . \]

Proof. The expression

\[ p_1(\zeta)\, p_2(\zeta^r)\, p_3(\zeta^{r^2}) \cdots p_t(\zeta^{r^{t-1}}) \tag{6.12} \]

is a sum of at most $m^t$ terms $c\zeta^k$, where each coefficient c is an integer with
$|c| < h^t$, and the degree satisfies

\[ k < m + mr + mr^2 + \cdots + mr^{t-1} < mr^t . \]

Each term has algebraic size at most $|c|\, \|\zeta\|^k < h^t \|\zeta\|^{m r^t}$; and all these terms
have a common denominator $(\operatorname{den}\zeta)^{m r^t}$, which is therefore a denominator
for (6.12). Noting that this denominator is at most $\|\zeta\|^{m r^t}$ and using the first
of the properties of algebraic size in Lemma 6.2, we have

\[ \|p_1(\zeta)\, p_2(\zeta^r) \cdots p_t(\zeta^{r^{t-1}})\| \le \max\bigl(\|\zeta\|^{m r^t},\ m^t h^t \|\zeta\|^{m r^t}\bigr) , \]

and the result follows.


The following result has already been proved on page 153; we restate it 
here for convenience. 
Lemma 6.1. The fundamental inequality for algebraic size. If $\beta$ is a non-zero
algebraic number of degree n, then $|\beta|\, \|\beta\|^{2n} \ge 1$.


Finally, we shall need an estimate for the size of the reciprocal of a non-zero 
algebraic number. 


Lemma 6.5. Algebraic size of a reciprocal. If $\beta$ is a non-zero algebraic
number of degree n, then $\|\beta^{-1}\| \le \|\beta\|^{2n}$.


Proof. Suppose that $\beta$ is a non-zero root of the irreducible polynomial

\[ b_n z^n + b_{n-1} z^{n-1} + \cdots + b_1 z + b_0 \tag{6.13} \]

whose coefficients are rational integers with no common factor. Then $\beta^{-1}$ is
a root of

\[ b_0 z^n + b_1 z^{n-1} + \cdots + b_{n-1} z + b_n , \tag{6.14} \]

which is also irreducible. Consequently, the conjugates of $\beta^{-1}$ are the
reciprocals of the conjugates $\beta_1, \beta_2, \ldots, \beta_n$ of $\beta$. From the fundamental inequality,
we have

\[ |\beta_k^{-1}| \le \|\beta_k\|^{2n} = \|\beta\|^{2n} . \]

Now let d be the denominator of $\beta$. Using (6.14) and then (6.13), we have

\[ \operatorname{den}(\beta^{-1}) \le |b_0| = |b_n|\, |\beta_1|\, |\beta_2| \cdots |\beta_n| \le |b_n|\, \|\beta\|^n . \]

On the other hand, $d\beta$ is an algebraic integer, so there is a polynomial equation

\[ d^n \beta^n + c_{n-1} d^{n-1} \beta^{n-1} + \cdots + c_1 d\beta + c_0 = 0 , \]

where the coefficients $c_k$ are rational integers. Comparing this with (6.13), in
which the $b_k$ have no common factor, we see that $|b_n| \le d^n$; hence

\[ \operatorname{den}(\beta^{-1}) \le d^n \|\beta\|^n \le \|\beta\|^{2n} . \]

Combining the estimates for the denominator and conjugates of $\beta^{-1}$
establishes the lemma.


All these results on algebraic size enable us to prove a generalisation of 
Mahler’s original example. 


Theorem 6.6. Functional equations and transcendence. Let f be a function
having a Taylor series with integer coefficients, convergent inside the unit
circle. Suppose that f satisfies a functional equation

\[ f(z) = a(z) f(z^r) + b(z) , \]

where $r \ge 2$ is an integer, a(z) and b(z) are polynomials with integral
coefficients, and a(z) has no roots inside the unit circle, except possibly at z = 0. Let
$\zeta$ be an algebraic number with $0 < |\zeta| < 1$. If f is a transcendental function,
then $f(\zeta)$ is a transcendental number.
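Proof. Step 1: iterating the functional equation. We have

\begin{align*}
f(z) &= a(z) f(z^r) + b(z) \\
     &= a(z)a(z^r) f(z^{r^2}) + a(z)b(z^r) + b(z) \\
     &= a(z)a(z^r)a(z^{r^2}) f(z^{r^3}) + a(z)a(z^r)b(z^{r^2}) + a(z)b(z^r) + b(z) \\
     &= \cdots \\
     &= A_t(z) f(z^{r^t}) + B_t(z) ,
\end{align*}

where

\[ A_t(z) = a(z)\, a(z^r) \cdots a(z^{r^{t-1}}) \]

and

\[ B_t(z) = b(z) + a(z)b(z^r) + \cdots + a(z)a(z^r) \cdots a(z^{r^{t-2}})\, b(z^{r^{t-1}}) . \]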


Step 2: construction of the auxiliary function. For any positive integer s there
exist polynomials $p_0(z), p_1(z), \ldots, p_s(z)$ with integral coefficients, not all zero,
having degree at most s, such that there is a power series

\[ E(z) = \sum_{j=0}^{s} p_j(z) f(z)^j = \sum_{k=s^2}^{\infty} e_k z^k . \]

This is true because it requires that the $(s+1)^2$ coefficients of the polynomials
satisfy a system of $s^2$ homogeneous equations with integral coefficients; since
there are more unknowns than equations this system certainly has a non-trivial
rational solution, and multiplying by a common denominator gives an
integral solution. Moreover, E(z) is not identically zero because f(z) is a
transcendental function.
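The linear algebra in Step 2 is entirely constructive. The sketch below is our own illustration: it takes Mahler's function $f(z) = z + z^2 + z^4 + \cdots$ as a concrete example (an assumption, since the theorem concerns a general f), builds the $s^2$ equations in the $(s+1)^2$ unknown coefficients, and finds a non-zero solution by Gauss-Jordan elimination over the rationals. All function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def series_f(n):
    """Coefficients of f(z) = z + z^2 + z^4 + ..., truncated to degree < n."""
    coeffs = [0] * n
    p = 1
    while p < n:
        coeffs[p] = 1
        p *= 2
    return coeffs

def mul(p, q, n):
    """Product of two power series, truncated to degree < n."""
    out = [0] * n
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                if i + j >= n:
                    break
                out[i + j] += a * b
    return out

def auxiliary(s):
    """Return (x, columns): x is a non-zero rational solution, and columns[c]
    holds the first s^2 coefficients of z^i f(z)^j for the c-th pair (j, i)."""
    n = s * s
    f, fj, columns = series_f(n), [1] + [0] * (n - 1), []
    for j in range(s + 1):
        for i in range(s + 1):
            columns.append(([0] * i + fj)[:n])   # shift f^j by z^i
        fj = mul(fj, f, n)
    # One equation for each power z^k with k < s^2.
    rows = [[Fraction(col[k]) for col in columns] for k in range(n)]
    pivots = []
    for c in range(len(columns)):
        r = len(pivots)
        if r == n:
            break
        pr = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
    # More unknowns than equations, so some column is free: set it to 1.
    free = next(c for c in range(len(columns)) if c not in pivots)
    x = [Fraction(0)] * len(columns)
    x[free] = Fraction(1)
    for r, c in enumerate(pivots):
        x[c] = -rows[r][free]
    return x, columns

def to_integers(x):
    """Multiply by a common denominator to obtain an integral solution."""
    common = 1
    for v in x:
        common = common * v.denominator // gcd(common, v.denominator)
    return [int(v * common) for v in x]
```

Calling `auxiliary(2)` returns one admissible choice of the nine coefficients; it need not coincide with the solution found by trial and error in the previous section, since the solution space has dimension at least five.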


Step 3: estimation of absolute value. Since f(z) is analytic inside the unit
circle, so is E(z). From the estimate (6.20) in appendix 2.3, we have

\[ |E(\zeta^{r^t})| \le c(s)\, |\zeta|^{s^2 r^t} , \]

where the constant c(s) does not depend on t. Let $c_1 = |\zeta|^{-1/2}$. Then $c_1 > 1$,
and the above inequality can be written

\[ |E(\zeta^{r^t})| \le c(s)\, c_1^{-2s^2 r^t} . \]

Note that the constant $c_1$ is independent of both s and t; the same will be
true of the constants $c_2, c_3, \ldots$ which will arise in the course of the proof.


Step 4: estimation of degrees. From the functional equation we have

\[ f(\zeta^{r^t}) = \frac{f(\zeta) - B_t(\zeta)}{A_t(\zeta)} , \]

noting that the denominator is not zero since a(z) is not zero at any of the
points $\zeta^{r^k}$. Hence $f(\zeta^{r^t})$ is an element of the vector space over Q spanned by
products of powers of $\zeta$ and powers of $f(\zeta)$, and so too is

\[ E(\zeta^{r^t}) = \sum_{j=0}^{s} p_j(\zeta^{r^t})\, f(\zeta^{r^t})^j . \]

Since by assumption both $\zeta$ and $f(\zeta)$ are algebraic, this space has (finite)
dimension n, independent of s and t.

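Step 5: estimation of algebraic size. We can regard $E(\zeta^{r^t})$ as a polynomial in
two variables $z_1$, $z_2$, having integral coefficients and degree at most s in each,
evaluated at $z_1 = \zeta^{r^t}$ and $z_2 = f(\zeta^{r^t})$. So by Corollary 6.3, we have

\[ \|E(\zeta^{r^t})\| \le c'(s)\, \|\zeta^{r^t}\|^s\, \|f(\zeta^{r^t})\|^s . \]

Now suppose that a and b have degrees less than m and heights less than
h; note that these numbers are independent of the parameters s and t. Then
$A_t(\zeta)$ is an expression of the type considered in Corollary 6.4, and $B_t(\zeta)$ is a
sum of t such expressions having a common denominator at most $(\operatorname{den}\zeta)^{m r^t}$;
therefore

\[ \|A_t(\zeta)\| \le m^t h^t \|\zeta\|^{m r^t} \le c_2^{r^t} \quad\text{and}\quad \|B_t(\zeta)\| \le t\, m^t h^t \|\zeta\|^{m r^t} \le (2c_2)^{r^t} , \]

where $c_2 = mh\, \|\zeta\|^m$ is a constant which does not depend on s or t. The lemma
concerning the algebraic size of a reciprocal then yields

\begin{align*}
\|f(\zeta^{r^t})\| &= \Bigl\| \frac{f(\zeta) - B_t(\zeta)}{A_t(\zeta)} \Bigr\| \\
 &\le \|f(\zeta) - B_t(\zeta)\|\, \|A_t(\zeta)\|^{2n} \\
 &\le 2\, \|f(\zeta)\|\, (2c_2)^{r^t}\, c_2^{2n r^t} \\
 &\le c_3^{r^t}
\end{align*}

with $c_3 = 4\, \|f(\zeta)\|\, c_2^{2n+1}$, and so

\[ \|E(\zeta^{r^t})\| \le c'(s)\, \|\zeta\|^{s r^t}\, c_3^{s r^t} = c'(s)\, c_4^{s r^t} . \]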
Step 6: conclusion. Choose s such that $c_4^{4n} < c_1^{s}$; this can be done since $c_1 > 1$
and $c_1$, $c_4$ and n do not depend on s. Then we can choose t with the following
properties:

• $c(s) \le c_1^{s^2 r^t}$ and $c'(s) \le c_4^{s r^t}$: this, again, is possible because c(s), c′(s),
  $c_1$ and $c_4$ do not depend on t;
• $E(\zeta^{r^t}) \ne 0$: this is possible because E(z) is analytic and not identically
  zero, and therefore cannot have a sequence of roots converging to zero
  (see appendix 2.2).

Finally, the fundamental inequality for the size of a non-zero algebraic number
gives

\[ |E(\zeta^{r^t})|\, \|E(\zeta^{r^t})\|^{2n} \ge 1 , \]

while from the above estimates, we have

\begin{align*}
|E(\zeta^{r^t})|\, \|E(\zeta^{r^t})\|^{2n} &\le c(s)\, c_1^{-2s^2 r^t}\, c'(s)^{2n}\, c_4^{2ns r^t} \\
 &\le c_1^{s^2 r^t}\, c_1^{-2s^2 r^t}\, c_4^{2ns r^t}\, c_4^{2ns r^t} = \Bigl( \frac{c_4^{4n}}{c_1^{s}} \Bigr)^{s r^t} < 1 .
\end{align*}

We have a contradiction, and the result is proved.


Comment. In order to simplify the proof, we have assumed more than we 
really need to in the statement of Theorem 6.6. For example, the only reason 
for the stipulation that a(z) has no roots inside the unit circle was to ensure 
that

$$A_t(\zeta) = a(\zeta)\, a(\zeta^{r})\, a(\zeta^{r^2}) \cdots a(\zeta^{r^{t-1}})$$

is not zero; clearly it would have been sufficient to demand that
$a(\zeta^{r^k}) \neq 0$ for k = 0, 1, 2, 3, ....


Nor is it essential that a(z) and b(z) be polynomials; if they are rational func- 
tions with integral coefficients the proof works in much the same way, though 
the estimates involving $A_t$ and $B_t$ become more intricate. Finally, the requirement
that the coefficients of a(z) and b(z), and the Taylor series coefficients
of f(z), be integers is also more stringent than necessary; it suffices that the 
field generated by all these coefficients be of finite degree when considered as 
a vector space over Q. 


6.4 A TRANSCENDENCE PROOF FOR THE THUE SEQUENCE 


We can apply similar methods to show that a “decimal” whose digits comprise
the Thue sequence is transcendental. In Chapter 1 the sequence was defined
recursively thus,

$$a_{2k} = a_k \,, \qquad a_{2k+1} = 1 - a_k = \begin{cases} 1 & \text{if } a_k = 0 \\ 0 & \text{if } a_k = 1, \end{cases}$$

with the initial condition $a_0 = 0$. Using the characterisation (page 8) of $a_k$
as the parity of the binary expansion of k, we see that the Thue sequence
is recognised by the DFA M shown in figure 6.2. We can write down the
generating function of M and rearrange the infinite series,


$$f(z) = \sum_{k=0}^{\infty} a_k z^k = \sum_{k=0}^{\infty} a_{2k} z^{2k} + \sum_{k=0}^{\infty} a_{2k+1} z^{2k+1}$$

$$\phantom{f(z)} = \sum_{k=0}^{\infty} a_k z^{2k} + \sum_{k=0}^{\infty} (1 - a_k) z^{2k+1} = (1 - z) \sum_{k=0}^{\infty} a_k z^{2k} + \sum_{k=0}^{\infty} z^{2k+1}$$

to obtain the functional equation

$$f(z) = (1 - z)\, f(z^2) + \frac{z}{1 - z^2} \,. \qquad (6.15)$$

Figure 6.2 A deterministic finite automaton for the Thue sequence.

This functional equation does not fall within the scope of Theorem 6.6 since 
one of the terms involves a rational function which is not a polynomial. As 
mentioned above, this difficulty is not insuperable, and the theorem could be 
extended so as to cover this case; however, instead of doing so we can prove
an irrationality result for f(ζ) by considering a closely related function.

Let $\{ b_k \}$ be the Thue sequence on { 1, −1 }, that is, the sequence which is
obtained from the original Thue sequence when every 0 is replaced by 1 and
every 1 by −1. It is not hard to see that $b_0 = 1$, that the sequence satisfies
the recurrence

$$b_{2k} = b_k \,, \qquad b_{2k+1} = -b_k$$

for k ≥ 0, and that

$$b_k = 1 - 2a_k$$

for every k. The generating function g(z) for $\{ b_k \}$ is related to the generating
function f(z) for $\{ a_k \}$ by

$$g(z) = \sum_{k=0}^{\infty} b_k z^k = \sum_{k=0}^{\infty} (1 - 2a_k) z^k = \sum_{k=0}^{\infty} z^k - 2 \sum_{k=0}^{\infty} a_k z^k = \frac{1}{1 - z} - 2 f(z) \,. \qquad (6.16)$$
k=0 k=0 


If ζ is algebraic and 0 < |ζ| < 1, then this identity shows that g(ζ) is algebraic
if and only if f(ζ) is algebraic. There is also a functional equation for g(z):
we can use (6.16) and the known functional equation for f(z), or we can start
from scratch to find

$$g(z) = \sum_{k=0}^{\infty} b_k z^k = \sum_{k=0}^{\infty} b_{2k} z^{2k} + \sum_{k=0}^{\infty} b_{2k+1} z^{2k+1} = \sum_{k=0}^{\infty} b_k z^{2k} - \sum_{k=0}^{\infty} b_k z^{2k+1} = (1 - z)\, g(z^2) \,.$$

Incidentally, this shows that g(z) has an elegant representation as an infinite
product,

$$g(z) = (1 - z)(1 - z^2)(1 - z^4)(1 - z^8) \cdots = \prod_{k=0}^{\infty} \bigl(1 - z^{2^k}\bigr) \,.$$


Exercise. Use a combinatorial argument, together with the fact that $a_k$ is
the parity of the binary representation of k, to prove the same result.
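
The product formula for g(z) can also be checked numerically before attempting a proof. The following is a minimal sketch in Python; the helper names `thue_pm` and `product_coeffs` are illustrative and do not come from the text. It expands the truncated product of the first m factors, a polynomial of degree 2^m − 1, and compares its coefficients with the signs given by the parity of the binary expansion.

```python
def thue_pm(k):
    """b_k = +1 if the binary expansion of k has an even number of ones, else -1."""
    return -1 if bin(k).count("1") % 2 else 1

def product_coeffs(m):
    """Coefficients of prod_{k<m} (1 - z^(2^k)), a polynomial of degree 2^m - 1."""
    coeffs = [1]
    for k in range(m):
        shift = 2 ** k
        new = coeffs + [0] * shift          # the "1 *" part of the factor
        for i, c in enumerate(coeffs):      # subtract z^shift times the old polynomial
            new[i + shift] -= c
        coeffs = new
    return coeffs

m = 5
assert product_coeffs(m) == [thue_pm(k) for k in range(2 ** m)]
```

This only confirms the identity for finitely many coefficients; the combinatorial argument asked for in the exercise is still needed for a proof.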


It is now easy to show that if ζ is algebraic and 0 < |ζ| < 1, then g(ζ) is
transcendental. The conditions of our main theorem are satisfied by

$$r = 2, \qquad a(z) = 1 - z \qquad\text{and}\qquad b(z) = 0 ;$$

and using the results quoted from Pólya and Szegő in appendix 2.4, we see
that g is a transcendental function; so g(ζ) is transcendental. It follows from
previous remarks that f(ζ) is also transcendental, and a particular corollary
is that the decimal

$$\tau = f\bigl(\tfrac{1}{10}\bigr) = 0.11010011001011010010110\cdots ,$$

which we proved irrational in Chapter 1, is in fact transcendental.
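
As a quick sanity check on the digits quoted above, the short Python sketch below regenerates them from the parity characterisation of the Thue sequence and verifies the defining recurrence. The function name `thue` is an illustrative choice, not notation from the text.

```python
def thue(k):
    """a_k: parity of the number of ones in the binary expansion of k."""
    return bin(k).count("1") % 2

# The recurrence a_{2k} = a_k, a_{2k+1} = 1 - a_k, with a_0 = 0.
assert thue(0) == 0
assert all(thue(2 * k) == thue(k) and thue(2 * k + 1) == 1 - thue(k)
           for k in range(1000))

# Digits a_1 a_2 a_3 ... after the decimal point (a_0 = 0 is the integer part).
digits = "".join(str(thue(k)) for k in range(1, 24))
print("0." + digits)   # prints 0.11010011001011010010110
```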


To employ the method of the present chapter it is not necessary to have a
generating function in the form of an infinite series or product: the important
thing is to be able to write down a suitable functional equation. For example,
we could speculatively define f by the continued fraction

$$f(z) = z + \cfrac{1}{z^2 + \cfrac{1}{z^4 + \cfrac{1}{z^8 + \cdots}}}$$

for |z| > 1; then f satisfies the relation

$$f(z) = z + \frac{1}{f(z^2)} \,,$$

which is somewhat similar to the functional equation (6.2), and we might
attempt to use this equation as the basis of a transcendence proof. It is
sometimes necessary, as we shall see in the next few pages, to consider not single
functions but n-tuples of functions, satisfying functional equations in which
the coefficients are matrices of rational functions.


Mahler’s work from the 1920s and 1930s was largely forgotten at the time but
was revived and greatly extended by Loxton and van der Poorten, Kubota,
Nishioka, and others, during the 1970s and 1980s. It serves as a fascinating
connection between the topics of automata, functional equations and
transcendence theory.


6.5 AUTOMATA AND FUNCTIONAL EQUATIONS


To conclude this chapter we investigate the relation between deterministic fi- 
nite automata and functional equations. Recall that at the beginning of this 
chapter we calculated the set of numbers accepted by a given DFA, wrote 
down its generating function and by means of simple algebra found a func- 
tional equation satisfied by the function. Later on (section 6.4) we found an 
equation for the generating function of the Thue sequence by manipulating 
power series and using the recurrence relation which defines the sequence. It 
is possible, however, to derive a functional equation, or a system of such equa- 
tions, directly from a DFA, thus demonstrating the close connection between 
automata and functional equations. 


Let r ≥ 2; we shall understand that when a natural number is written
in base r, the first digit is never zero (and, in particular, 0 is represented by
the empty string). Let M be a DFA over the alphabet {0, 1, ..., r − 1} and
suppose that M has s states, numbered 1, 2, ..., s. For each state m and each
non-negative integer k write $f_{m,k} = 1$ if the base r digits of k, read as usual
from left to right, lead from the initial state of M to state m; and $f_{m,k} = 0$
otherwise. We define s generating functions

$$f_m(z) = \sum_{k=0}^{\infty} f_{m,k}\, z^k ;$$

that is, $f_m(z)$ is the sum of $z^k$ for all k which end up in state m when written
in base r and “processed” by the automaton. We seek relations between the
functions $f_m(z)$ and $f_m(z^r)$.

We shall begin by splitting up the sum defining $f_m(z)$ in much the same
way as we did on page 162. Dividing a non-negative integer by r to give
quotient k and remainder j, we have

$$f_m(z) = \sum_{j=0}^{r-1} \sum_{k=0}^{\infty} f_{m,rk+j}\, z^{rk+j} ,$$

and we now need to determine the connection between the coefficients $f_{m,rk+j}$
and $f_{m',k}$. The digits of rk + j are those of k, with a digit j appended at the
right-hand end, and so M takes rk + j to state m if and only if it takes k to
some state m′ from which an arrow marked j leads to m. Symbolically,

$$f_{m,rk+j} = 1 \quad\text{if and only if}\quad f_{m',k} = 1 \text{ and } m' \xrightarrow{\,j\,} m \text{ for some } m' ;$$

and this may be restated as

$$f_{m,rk+j} = 1 \quad\text{if and only if}\quad f_{m_1,k} = 1 \text{ or } f_{m_2,k} = 1 \text{ or } \ldots \text{ or } f_{m_t,k} = 1 ,$$

where $m_1, m_2, \ldots, m_t$ are the states from which an arrow marked j leads to m.
(Observe, incidentally, that m itself may be one of these states.) To simplify
this result note that

$$0 \le f_{m_1,k} + f_{m_2,k} + \cdots + f_{m_t,k} \le \sum_{\text{all } m} f_{m,k} = 1 ;$$

the final equality being true because the prohibition on leading zeros means
that the “processing” of k leads to one and only one of the states of M.
We also know that each individual $f_{m,k}$ is either 0 or 1, and hence the last
equivalence can be expressed as

$$f_{m,rk+j} = 1 \quad\text{if and only if}\quad f_{m_1,k} + f_{m_2,k} + \cdots + f_{m_t,k} = 1 ,$$

or simply

$$f_{m,rk+j} = f_{m_1,k} + f_{m_2,k} + \cdots + f_{m_t,k} \,. \qquad (6.17)$$

Figure 6.3 A deterministic finite automaton.



Before proceeding further we illustrate with an example. 


For the automaton shown in figure 6.3, we have r = 2; and we shall first
consider state 2. The only arrow leading to state 2 is that which originates at
state 3 and is marked 0. An odd integer, written in base 2, ends with a 1 and
so cannot be taken to state 2 by the DFA; thus $f_{2,2k+1} = 0$ for all k. An even
integer ends in 0, and the DFA takes 2k to state 2 if and only if it takes k to
state 3; so $f_{2,2k} = 1$ if and only if $f_{3,k} = 1$. Since the $f_{m,k}$ can only take the
values 0 and 1, this is equivalent to saying that $f_{2,2k} = f_{3,k}$.

Consider state 3. By similar arguments to those above, we have $f_{3,2k} = f_{2,k}$.
To evaluate $f_{3,2k+1}$ we must note that two arrows marked 1 point to
state 3; thus the DFA takes 2k + 1 to state 3 if and only if it takes k either to
state 1 or to state 3. We have

$$f_{3,2k+1} = f_{1,k} + f_{3,k} \,.$$

To see this remember that each of the three terms is either 0 or 1. If $f_{3,2k+1} = 1$,
then processing k through M leads to state 1 or 3, so either $f_{1,k}$ or $f_{3,k}$
must be 1, and as they cannot both be 1 their sum is 1; while if $f_{3,2k+1} = 0$,
then processing k leads to neither of these states, so $f_{1,k} = f_{3,k} = 0$.

Continuing the general argument, we may use the recurrence (6.17) to find
a system of functional equations for the $f_m(z)$. We have

$$f_m(z) = \sum_{j=0}^{r-1} \sum_{k=0}^{\infty} f_{m,rk+j}\, z^{rk+j} = \sum_{j=0}^{r-1} \sum_{k=0}^{\infty} \; \sum_{m' \xrightarrow{j} m} f_{m',k}\, z^{rk+j}$$

$$\phantom{f_m(z)} = \sum_{j=0}^{r-1} z^j \sum_{m' \xrightarrow{j} m} \; \sum_{k=0}^{\infty} f_{m',k}\, z^{rk} = \sum_{j=0}^{r-1} z^j \sum_{m' \xrightarrow{j} m} f_{m'}(z^r) \,;$$

here the inner sum is over all states m′ from which an arrow labelled j goes
to state m. This identity may be written as

$$f_m(z) = \sum_{m'=1}^{s} w_{m,m'}(z)\, f_{m'}(z^r) \,, \qquad (6.18)$$

where

$$w_{m,m'}(z) = \sum_{j} z^j ,$$

the sum being over all j for which the DFA M has an arrow marked j pointing
from state m′ to state m. In fact, we may define a vector and a matrix

$$\mathbf{f}(z) = \begin{pmatrix} f_1(z) \\ \vdots \\ f_s(z) \end{pmatrix} \qquad\text{and}\qquad W(z) = \begin{pmatrix} w_{1,1}(z) & \cdots & w_{1,s}(z) \\ \vdots & \ddots & \vdots \\ w_{s,1}(z) & \cdots & w_{s,s}(z) \end{pmatrix} ,$$

whereupon the system of s functional equations becomes the matrix–vector
equation

$$\mathbf{f}(z) = W(z)\, \mathbf{f}(z^r) \,. \qquad (6.19)$$
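
The coefficient recurrence behind (6.18) is easy to test on a computer. The sketch below is a hedged illustration in Python: the dictionary encoding of the transition function and the names `state_indicators` and `check_recurrence` are assumptions made for this example, and the Thue automaton's transitions are read off from the functional equations quoted in the first example below rather than from the figure itself.

```python
def state_indicators(delta, start, r, n):
    """f[m][k] = 1 if the base-r digits of k lead from `start` to state m, else 0."""
    states = sorted({q for (q, _) in delta})
    f = {m: [0] * n for m in states}
    reached = [start] * n                       # k = 0 is the empty string
    for k in range(1, n):
        # the digits of k are those of k // r followed by the digit k % r
        reached[k] = delta[(reached[k // r], k % r)]
    for k in range(n):
        f[reached[k]][k] = 1
    return f

def check_recurrence(delta, start, r, n):
    """Check f_{m, rk+j} = sum of f_{m', k} over arrows m' --j--> m, as in (6.17)."""
    f = state_indicators(delta, start, r, n)
    for m in f:
        for k in range(n // r):
            for j in range(r):
                if k == 0 and j == 0:
                    continue    # 0 is the empty string; this case needs a 0-loop at the start state
                rhs = sum(f[mp][k] for mp in f if delta[(mp, j)] == m)
                if f[m][r * k + j] != rhs:
                    return False
    return True

# Thue automaton (assumed encoding): digit 0 keeps the state, digit 1 swaps states 1 and 2.
THUE = {(1, 0): 1, (1, 1): 2, (2, 0): 2, (2, 1): 1}
print(check_recurrence(THUE, 1, 2, 256))
```

The case k = j = 0 is skipped because 0 is written as the empty string; the identity (6.18) as stated also covers it whenever the initial state has an arrow marked 0 leading back to itself, as it does here.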
Examples. 


• Consider the DFA in figure 6.2, which recognises the Thue sequence. In
this case s = 2 and (6.18) becomes

$$f_1(z) = f_1(z^2) + z f_2(z^2) \qquad\text{and}\qquad f_2(z) = z f_1(z^2) + f_2(z^2) \,.$$

Since the only accepting state is state 2, the generating function for the
Thue sequence is $f(z) = f_2(z)$. Also

$$f_1(z) + f_2(z) = \frac{1}{1 - z} \qquad\text{and}\qquad f_1(z^2) + f_2(z^2) = \frac{1}{1 - z^2} \,,$$

and so the previous equation may be written

$$f(z) = z \Bigl( \frac{1}{1 - z^2} - f(z^2) \Bigr) + f(z^2) \,;$$

this simplifies to

$$f(z) = (1 - z)\, f(z^2) + \frac{z}{1 - z^2} \,,$$

in agreement with the functional equation (6.15).


• For the automaton in figure 6.3 we may write

$$\begin{pmatrix} f_1(z) \\ f_2(z) \\ f_3(z) \end{pmatrix} = \begin{pmatrix} 1 & z & 0 \\ 0 & 0 & 1 \\ z & 1 & z \end{pmatrix} \begin{pmatrix} f_1(z^2) \\ f_2(z^2) \\ f_3(z^2) \end{pmatrix} .$$

The generating function of the DFA is $f(z) = f_1(z) + f_3(z)$ and we can,
for example, eliminate $f_3$ to give

$$f_1(z) = f_1(z^2) + z f_2(z^2) \,, \qquad f(z) = f_1(z^2) + z f(z^2) + (1 + z) f_2(z^2) \,;$$

since

$$f(z) + f_2(z) = \frac{1}{1 - z}$$

we then obtain

$$f_1(z) = f_1(z^2) - z f(z^2) + \frac{z}{1 - z^2} \,, \qquad f(z) = f_1(z^2) - f(z^2) + \frac{1}{1 - z} \,.$$

The price we pay for reducing the number of equations from three to
two is that the homogeneous system has turned into a non-homogeneous
system; moreover, it appears to be impossible to derive a single
functional equation for f(z) in terms of $f(z^2)$. To give transcendence proofs
for numbers derived from arbitrary automata we would need to consider
matrix functional equations such as (6.19).


For the DFA on page 166 we might also seek to prove transcendence of
the “decimal”

1.3233123231331233123132···

in any base r ≥ 4. Here, instead of just writing down a digit 1 in the kth
place when the DFA accepts k and a 0 when it does not, we have written
down the state arrived at when k is input. This number is f(1/r), where

$$f(z) = f_1(z) + 2 f_2(z) + 3 f_3(z) \,,$$

and again we could find a system of functional equations involving f(z)
and some of the other $f_m(z)$.
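
The digits of this “decimal” can be regenerated mechanically. The Python sketch below assumes a transition table for the automaton of figure 6.3 that has been inferred from the matrix W(z) in the previous example (the figure itself is not reproduced here), so the table `DELTA` and the helper `state_of` should be read as illustrative assumptions.

```python
# Transition table inferred from the matrix W(z) for figure 6.3 (an assumption):
# entry (m', j) -> m means that an arrow marked j leads from state m' to state m.
DELTA = {(1, 0): 1, (1, 1): 3, (2, 0): 3, (2, 1): 1, (3, 0): 2, (3, 1): 3}

def state_of(k, delta=DELTA, start=1, r=2):
    """State reached when the base-r digits of k are read from left to right."""
    digits = []
    while k:
        digits.append(k % r)
        k //= r
    q = start                       # k = 0 is the empty string, so it stays at the start
    for d in reversed(digits):
        q = delta[(q, d)]
    return q

states = "".join(str(state_of(k)) for k in range(23))
print(states[0] + "." + states[1:])   # prints 1.3233123231331233123132
```

The output agrees with the digits displayed above, which supports the inferred table.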


6.6 CONCLUSION 


In this chapter we have seen that certain decimals in which the digits form 
a simple “pattern” can be generated as the output of a deterministic finite 
automaton. We can use the automaton (or other means) to write the given 
decimal in the form f(1/b) and find an identity relating f(z) and $f(z^r)$, or
a system of such identities; and in certain cases we can use these functional 
equations to prove the given decimal transcendental. 
EXERCISES


6.1 Prove the third part of Lemma 6.2: if $\beta_1, \beta_2, \ldots, \beta_m$ are algebraic
numbers, then

$$\|\beta_1 \beta_2 \cdots \beta_m\| \le \|\beta_1\| \, \|\beta_2\| \cdots \|\beta_m\| \,.$$

Give one example where equality does not hold, and one where it does.
Also, show that the inequality

$$\|\beta_1 + \beta_2\| \le \|\beta_1\| + \|\beta_2\|$$

is not always true.


6.2 An alternative definition of algebraic size. If α is a root of the irreducible
polynomial $f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ with relatively prime
integer coefficients, define

$$\|\alpha\| = 1 + H(f) = 1 + \max |a_k| \,.$$

Prove that any non-zero algebraic number α satisfies $|\alpha| \, \|\alpha\| > 1$, but
that the stronger inequality $|\alpha| \, H(f) > 1$ is not always true.


6.3 Proving a function to be transcendental. Let p(z) be a (complex)
polynomial with degree k and leading coefficient $p_k$. Show that if $c_1, c_2$ are
positive real constants with $c_1 < |p_k| < c_2$, then

$$c_1 |z|^k < |p(z)| < c_2 |z|^k$$

whenever |z| is sufficiently large. Use this to prove that the exponential
function, $f(z) = e^z$, is a transcendental function.


6.4 For any non-negative integer k, let $a_k = 1$ if k, written in base 10,
contains the digits 0 and 1 only, and $a_k = 0$ otherwise. Prove that the
real number α with decimal digits $a_k$, in any base b ≥ 2, that is,

$$\alpha = a_0 . a_1 a_2 a_3 a_4 \cdots = 1.100000000110000\cdots = \sum_{k=0}^{\infty} \frac{a_k}{b^k} \,,$$

is transcendental.


6.5 Let r be an integer, r ≥ 2, and write

$$S_r = \{\, k \in \mathbb{N} \mid k \text{ has an odd number of digits in base } r \,\} \,.$$

Prove that

$$\alpha = \sum_{k \in S_r} \frac{1}{m^k}$$

is transcendental for any integer m ≥ 2.


6.6 Let

$$f(z) = z + 2z^2 + 2z^3 + 4z^4 + 4z^5 + 4z^6 + 4z^7 + 8z^8 + 8z^9 + \cdots ,$$

where the coefficients consist of a 1, two 2s, four 4s, eight 8s and so on.
Prove that if ζ is algebraic and 0 < |ζ| < 1, then f(ζ) is transcendental.


6.7 Consider

$$f(z) = z + 2z^2 + 2z^3 + 3z^4 + 3z^5 + 3z^6 + 3z^7 + 4z^8 + 4z^9 + \cdots ,$$

where the coefficients consist of a 1, two 2s, four 3s, eight 4s and so on.
By relating f(z) to a function we have already studied in this chapter,
show that f(ζ) is transcendental for non-zero algebraic ζ inside the unit
circle.


6.8 Let

$$f(z) = z^2 - z^3 - z^4 + z^5 + z^6 + z^7 - z^8 - z^9 - \cdots ,$$

where the signs (starting from the $z^2$ term) consist of a plus, 2 minuses, 3
pluses, 6 minuses, 9 pluses, 18 minuses and so on. Prove that if ζ is
algebraic and 0 < |ζ| < 1, then f(ζ) is transcendental.


6.9 Suppose that f(z) is a function which is analytic for |z| < 1 and satisfies
the functional equation


f(A) = (1+ 2) FF) .


with f(0) = 1. Suppose further that f(z) is a transcendental function,
and let ζ be a non-zero algebraic number with |ζ| < 1. Prove that at
least one of f(ζ) and $f(\zeta^2)$ is a transcendental number.


6.10 Let S be the set of non-negative integers whose binary representation
includes (at least) two consecutive ones. That is,

S = {11, 110, 111, 1011, 1100, 1101, 1110, 1111, 10011, ...}
  = {3, 6, 7, 11, 12, 13, 14, 15, 19, ...}

in binary and decimal notation respectively.

(a) Construct a DFA which accepts S.

(b) If f(z) is the characteristic function of S, find a linear system of
functional equations relating the values of f and other functions
at z to their values at $z^2$. Use as few other functions as you can.
APPENDIX 1: ALPHABETS, LANGUAGES AND DFAS


An alphabet is a finite set. If Σ is an alphabet, then a word over Σ is any
finite sequence of elements of Σ. A word is customarily written with neither
brackets nor commas, $w = a_1 a_2 \cdots a_n$, where the symbols $a_k$ are elements of
Σ. The length of the word just described is n. There is a unique word ε of
length zero, called the empty word. We denote by Σ* the set of all words
over Σ. A language over Σ is a subset of Σ*.


A deterministic finite automaton M over an alphabet Σ is a quadruple
$(Q, q_1, \delta, F)$, where

• Q is a finite set consisting of the states of M;
• $q_1$ is an element of Q, the initial state;
• δ is a function from Q × Σ to Q, known as the transition function;
• F ⊆ Q is the set of final or accepting states.


The extended transition function δ* : Q × Σ* → Q is defined inductively:

$$\delta^*(q, \varepsilon) = q \qquad\text{and}\qquad \delta^*(q, wa) = \delta(\delta^*(q, w), a)$$

for all q ∈ Q, w ∈ Σ* and a ∈ Σ.


A word w ∈ Σ* is said to be accepted, or recognised, by M if and only if
$\delta^*(q_1, w) \in F$, and the subset of Σ* given by

$$L(M) = \{\, w \in \Sigma^* \mid w \text{ is accepted by } M \,\}$$

is the language accepted by M.
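
These definitions translate almost directly into code. The following Python sketch is one possible encoding, with the transition function stored as a dictionary keyed by (state, letter) pairs; the names `delta_star`, `accepts` and the example automaton `ODD_ONES` are hypothetical and chosen only for illustration.

```python
def delta_star(delta, q, word):
    """Extended transition function: apply delta letter by letter, left to right."""
    for a in word:
        q = delta[(q, a)]
    return q

def accepts(delta, q1, final, word):
    """True if the word leads from the initial state q1 to a state in `final`."""
    return delta_star(delta, q1, word) in final

# Hypothetical example: words over {"0", "1"} containing an odd number of 1s.
ODD_ONES = {("even", "0"): "even", ("even", "1"): "odd",
            ("odd", "0"): "odd", ("odd", "1"): "even"}

print(accepts(ODD_ONES, "even", {"odd"}, "1011"))   # three 1s, so accepted
print(accepts(ODD_ONES, "even", {"odd"}, ""))       # the empty word is rejected
```

The loop in `delta_star` unrolls the inductive definition: the empty word leaves the state unchanged, and each further letter applies δ once.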


APPENDIX 2: SOME RESULTS OF COMPLEX ANALYSIS 


A2.1 Taylor series and analytic functions


If a function f is analytic in an open disc D with centre $z_0$, then it has a
Taylor series

$$\sum_{k=0}^{\infty} a_k (z - z_0)^k \,, \qquad (6.19)$$

with $a_k = f^{(k)}(z_0)/k!$, which converges to f(z) in D. Conversely, if a series
such as (6.19) converges in an open disc D with centre $z_0$, then it represents
a function f(z) analytic in D; moreover, the coefficients $a_k$ are precisely the
Taylor series coefficients of f.


Proof. See, for example, Churchill and Brown [19], sections 57, 65 and 66.


A2.2 Limit points of roots of an analytic function


Suppose that f(z) is analytic in a disc D ⊆ C, and that f has a sequence of
roots $w_0, w_1, w_2, \ldots$ which converges to a limit w in D. Then f is identically
zero in D.


Proof. The zeros of an analytic function, other than the zero function, are
isolated ([19], section 75). But under the stated conditions f is analytic and
therefore continuous at w, and we have

$$f(w) = f\Bigl( \lim_{k \to \infty} w_k \Bigr) = \lim_{k \to \infty} f(w_k) = 0 \,.$$

Thus w is a root of f and is not isolated since every neighbourhood of w
contains a point $w_k$ which is also a root of f. Hence f is identically zero, as
claimed.


A2.3 Estimation of power series


The absolute value of a (convergent) power series can be estimated from its
first non-zero term alone. More precisely, let ε > 0 and suppose that the power
series

$$\sum_{k=K}^{\infty} a_k z^k$$

converges in the disc |z| < R + ε. Then there is a constant c such that for all
z with 0 < |z| ≤ R, we have

$$\Bigl| \sum_{k=K}^{\infty} a_k z^k \Bigr| \le c |z|^K \,. \qquad (6.20)$$

Note that the constant c does not depend on z, but may depend on R and
on the particular power series we are considering.


Proof. Denote the given series by f(z). Since it converges for |z| < R + ε, so
does the series

$$\sum_{k=K}^{\infty} a_k z^{k-K} \,.$$

But as we have seen above, a convergent Taylor series represents an analytic
function; moreover, an analytic function on a closed disc |z| ≤ R is bounded
([19], section 18); so for all z in this disc, we have

$$|f(z)| = |z|^K \, \Bigl| \sum_{k=K}^{\infty} a_k z^{k-K} \Bigr| \le c |z|^K$$

as claimed.


Example. For the well-known series

$$\sin z = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{(2k+1)!}$$

we have K = 1. The series converges for all complex z; choose, for instance,
R = 3. If |z| ≤ R, then

$$\Bigl| \frac{\sin z}{z} \Bigr| = \Bigl| \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(2k+1)!} \Bigr| \le \sum_{k=0}^{\infty} \frac{3^{2k}}{(2k+1)!} = \frac{\sinh 3}{3} = 3.3392\ldots \,.$$

Thus

|sin z| ≤ 4|z| whenever 0 < |z| ≤ 3.

Note that the familiar real inequality |sin z| ≤ |z| is not generally true for
complex z.
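
The example can be explored numerically. The Python sketch below samples points on a few circles inside the disc |z| ≤ 3 (the sampling grid is an arbitrary choice made for this illustration) and compares |sin z|/|z| with the constant sinh 3 / 3; it also exhibits a complex point where |sin z| exceeds |z|.

```python
import cmath
import math

c = math.sinh(3) / 3                      # the bound obtained in the example, about 3.339

def ratio(z):
    """|sin z| / |z| for a non-zero complex z."""
    return abs(cmath.sin(z)) / abs(z)

# Sample points on the circles |z| = rho for a few radii rho <= 3.
samples = [cmath.rect(rho, 2 * math.pi * j / 360)
           for rho in (0.5, 1.0, 2.0, 3.0) for j in range(360)]
worst = max(ratio(z) for z in samples)

print(worst <= c + 1e-9)                  # the bound holds on every sample
print(ratio(1j))                          # sinh(1), which is greater than 1
```

The largest sampled ratio occurs on the imaginary axis at |z| = 3, where the estimate is attained, so the constant sinh 3 / 3 cannot be improved for this disc.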


A2.4 Algebraic and transcendental functions 


To show that the function f defined in equation (6.1) on page 149 is
transcendental, we may use two of many results on this topic to be found in George
Pólya and Gábor Szegő’s classic text Problems and Theorems in Analysis [51],
Part VIII, Chapter 3, section 4.


Lemma 6.7. Suppose that the coefficients of the power series

$$f(z) = \sum_{k=0}^{\infty} a_k z^k$$

are integers and have only finitely many different values. Then f is a rational
function if and only if the sequence of coefficients is eventually periodic.


Comment. Once again we see a strong analogy between rational numbers
and rational functions.


Lemma 6.8. Let f have a power series with integral coefficients. If f is an
algebraic function but not a rational function, then the series has radius of
convergence strictly less than 1.


Corollary 6.9. The function f that we considered on pages 149-156 is
transcendental.


Proof. The power series coefficients of f are

$$a_k = \begin{cases} 1 & \text{if } k \text{ is a power of 2} \\ 0 & \text{if not,} \end{cases}$$

so the first lemma implies that f is not a rational function. But the ratio
test shows that the series for f has radius of convergence equal to 1, and
therefore by the second lemma f cannot be an algebraic function. Hence f is
transcendental.


In the present case the non-vanishing of E can be proved by conceptually
simpler, though more laborious, methods: we may expand the sum defining
E(z), and the increasingly large gaps in the series for f(z) will enable us to
explicitly identify a non-zero term in the expansion.


Let p be maximal such that $a_p(z)$ is not the zero polynomial, and let q be
maximal such that the coefficient of $z^q$ in $a_p(z)$ is non-zero. Choose r such
that $s < 2^r$; the significance of this choice will appear later. Expanding

$$a_p(z) f(z)^p = (a_{p0} + \cdots + a_{pq} z^q)(z + z^2 + z^4 + z^8 + \cdots)^p \qquad (6.21)$$

we obtain, among others, a term

$$p! \, a_{pq} \, z^q \, z^{2^r} z^{2^{r+1}} z^{2^{r+2}} \cdots z^{2^{r+p-1}} \,, \qquad (6.22)$$

where the p! is due to the p! different orders in which p different terms $z^{2^k}$ can
be chosen from the p repetitions of the second factor in (6.21). The coefficient
of this term is not zero. We shall show that

• if we expand the expression (6.21) there are no terms with the same
exponent as (6.22), except for those we have already counted; and

• if we expand $a_j(z) f(z)^j$ for 0 ≤ j < p there are no such terms at all.

It will follow that the term (6.22) in the power series of E(z) is not cancelled
by any other term, and so E(z) is not identically zero.

Notation. We shall write $S_j$ for a sum of j powers of 2.


So, firstly, we expand (6.21) and seek terms in $z^k$, with

$$k = q + 2^r + 2^{r+1} + \cdots + 2^{r+p-1} = q + 2^{r+p} - 2^r \,.$$

Any such term has an exponent of the form $q' + S_p$ with 0 ≤ q′ ≤ q. So we
require $q' + S_p = q + 2^{r+p} - 2^r$, that is,

$$S_p = 2^{r+p} - 2^r + q - q' \,.$$

Since $0 \le q' \le q \le s < 2^r$, this equality implies

$$2^{r+p} - 2^r \le S_p < 2^{r+p} \,.$$

But if $S_p < 2^{r+p}$, we have

$$S_p \le 2^{r+p-1} + 2^{r+p-2} + \cdots + 2^r = 2^{r+p} - 2^r \le S_p \,,$$

and so the only solution is the one we have already. Similarly, if we expand
$a_j(z) f(z)^j$ we want

$$q' + S_j = q + 2^{r+p} - 2^r$$

with 0 ≤ q′ ≤ s and 0 ≤ j ≤ p − 1. Now

$$-2^r < -s \le q - q' \le s < 2^r$$

and so

$$2^{r+p} - 2^{r+1} = 2^{r+p} - 2^r - 2^r < S_j < 2^{r+p} \,;$$

as above we have

$$S_j \le S_{p-1} \le 2^{r+p-1} + 2^{r+p-2} + \cdots + 2^{r+1} = 2^{r+p} - 2^{r+1} \,;$$

and the contradiction between this inequality and the previous one shows that
no term can be found with exponent the same as that in (6.22). Therefore,
E(z), upon expansion, contains a term (6.22) which is not cancelled out by
any other term, and so E(z) does not vanish identically.


Comment. It is comparatively simple to show that f is not a rational
function. Suppose that f(z) = p(z)/q(z), where p and q are polynomials of degrees
m and n respectively. Then the functional equation (6.2) can be rewritten as
the polynomial identity

$$p(z) q(z^2) - p(z^2) q(z) = z \, q(z) q(z^2) \,; \qquad (6.23)$$

the three terms have degrees m + 2n, 2m + n and 3n + 1 respectively.

• If m + 2n ≥ 2m + n, then n ≥ m and so 3n + 1 > m + 2n. Thus the
right-hand side has greater degree than the left.

• If m + 2n < 2m + n, then n < m, so n + 1 ≤ m and 3n + 1 ≤ m + 2n.
In this case the left-hand side has greater degree than the right.

Thus no equation such as (6.23) can hold, and f is not a rational function.


For an alternative proof, once again suppose f(z) = p(z)/q(z); without
loss of generality we may assume that the polynomials p and q have no common
roots. From the identity (6.23), we have for any α ∈ C that

$$q(\alpha) = 0 \quad\text{if and only if}\quad q(\alpha^2) = 0 \,. \qquad (6.24)$$

It follows that any root α of q(z) must satisfy α = 0 or |α| = 1, as otherwise
q(z) would have infinitely many roots $\alpha, \alpha^2, \alpha^4, \ldots$. If α = 0, then 0 is, say,
an n-fold root of the left-hand side in (6.23) and a (3n + 1)-fold root of the
right-hand side: this is impossible. If q(z) has roots on the unit circle, choose
a root $e^{i\theta}$ with minimal positive θ; then (6.24) shows that $e^{i\theta/2}$ is also a root,
contradicting minimality. Thus q(z) cannot have any roots at all, and must
be a constant polynomial; but then the left-hand side and right-hand side
of (6.23) have different degrees. We have a contradiction, and so f is not a
rational function.




APPENDIX 3: A RESULT ON LINEAR EQUATIONS 


If m < n, then a homogeneous system of m linear equations in n unknowns,
with coefficients in a field Q, always has a non-zero solution in $Q^n$.


Proof. Any system of linear equations over Q has no solution, a unique
solution or infinitely many solutions in $Q^n$. In the present case the system, being
homogeneous, cannot fail to have solutions; and since the number of variables
exceeds the number of equations, the general solution must contain one or
more parameters. So the system has infinitely many solutions, and hence at
least one non-zero solution.
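
The proof is constructive in spirit: row reduction identifies a free variable, and setting it to 1 produces a non-zero solution. The following Python sketch illustrates this over the rationals using exact `Fraction` arithmetic; the function name `nonzero_solution` and the list-of-rows input format are assumptions made for the example.

```python
from fractions import Fraction

def nonzero_solution(rows, n):
    """Return a non-zero x in Q^n with A x = 0, for an m-by-n system with m < n."""
    a = [[Fraction(v) for v in row] for row in rows]
    pivots = []                      # pivot column of each reduced row, in order
    r = 0                            # number of pivot rows found so far
    for c in range(n):
        # look for a row at or below position r with a non-zero entry in column c
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        a[r] = [v / a[r][c] for v in a[r]]          # scale the pivot to 1
        for i in range(len(a)):
            if i != r and a[i][c] != 0:             # clear the rest of column c
                factor = a[i][c]
                a[i] = [vi - factor * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    # at most m < n columns are pivot columns, so a free column exists
    free = next(c for c in range(n) if c not in pivots)
    x = [Fraction(0)] * n
    x[free] = Fraction(1)
    for row, c in zip(a, pivots):
        x[c] = -row[free]
    return x

print(nonzero_solution([[1, 2, 3], [4, 5, 6]], 3))
```

For the sample system the routine returns the vector (1, −2, 1), which satisfies both equations.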


CHAPTER 7


Lambert’s Irrationality Proofs

DOI: 10.1201/9781003111207-7


Proving that the ratio of the diameter of a circle to its
circumference is not rational will not surprise geometers
... but what merits more attention,
and will be rather a greater surprise, is that
if the ratio of an arc of a circle to its radius is rational,
then the ratio of the tangent to the radius is not.

J.H. Lambert [37]


IN THE 1760s J.H. Lambert proved the irrationality of π, e and related
numbers by means of continued fractions. Roughly, the methods used are
those we used in Chapter 4 to find the continued fraction of e; however, more
general continued fractions must be employed. For numbers related to e and
to π we shall need expressions of the forms

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}} \qquad\text{and}\qquad a_0 - \cfrac{b_1}{a_1 - \cfrac{b_2}{a_2 - \cfrac{b_3}{a_3 - \cdots}}}$$

respectively, where in each case the successive numerators and denominators
will be positive integers. In the first case the convergence of the (infinite)
continued fraction, and the behaviour of its convergents, follow patterns quite
similar to those for simple continued fractions, and all the properties we shall
need are proved with little additional difficulty. The second case is significantly
harder. The main problem is to ensure that the continued fraction converges,
not only in the infinite case but also in the finite case, for it is by no means
clear that attempting to calculate an expression such as

$$a_0 - \cfrac{b_1}{a_1 - \cfrac{b_2}{a_2 - \cdots}}$$

will not at some stage lead to a zero denominator. In fact, finding and proving
general conditions which ensure that such expressions are meaningful turns out
to be so messy that we shall use the continued fractions only as an inspiration
for defining certain sequences which will correspond to convergents.


We begin by using the functions f(c; z) from Chapter 4 to develop
continued fractions for tan z and related functions. First, recall that

$$\bigl(\tfrac12\bigr)_k = \frac{(2k)!}{2^{2k} k!} \qquad\text{and}\qquad \bigl(\tfrac32\bigr)_k = (2k+1) \bigl(\tfrac12\bigr)_k = \frac{(2k+1)!}{2^{2k} k!} \,.$$

Using these expressions as on page 97, we have

$$f\bigl(\tfrac12; \tfrac14 z^2\bigr) = \cosh z \qquad\text{and}\qquad z f\bigl(\tfrac32; \tfrac14 z^2\bigr) = \sinh z \,;$$

similarly,

$$f\bigl(\tfrac12; -\tfrac14 z^2\bigr) = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(\tfrac12)_k \, 2^{2k} k!} = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(2k)!} = \cos z$$

$$z f\bigl(\tfrac32; -\tfrac14 z^2\bigr) = z \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(\tfrac32)_k \, 2^{2k} k!} = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{(2k+1)!} = \sin z \,.$$

The identity

$$f(c; z) = f(c+1; z) + \frac{z}{c(c+1)} f(c+2; z)$$

from page 95 can be rearranged as

$$\frac{f(c+1; z)}{2c \, f(c; z)} = \cfrac{1}{2c + \cfrac{4z}{2c+2} \, \cfrac{f(c+2; z)}{f(c+1; z)}} \,; \qquad (7.1)$$

iterating this expression (and ignoring, for the time being, any convergence
problems), gives the continued fraction

$$\frac{f(c+1; z)}{2c \, f(c; z)} = \frac{1}{2c+}\; \frac{4z}{2c+2+}\; \frac{4z}{2c+4+}\; \frac{4z}{2c+6+} \cdots .$$

Now taking $c = \tfrac12$, replacing z by $\tfrac14 z^2$ and multiplying both sides by z yields

$$\tanh z = \frac{z}{1+}\; \frac{z^2}{3+}\; \frac{z^2}{5+}\; \frac{z^2}{7+} \cdots , \qquad (7.2)$$

which agrees with Chrystal [18], Part II, Chapter XXXIV, section 21, equation
(16). We can employ a similar procedure to find the relevant continued fraction
when z is written explicitly as a rational number s/t. Divide both sides of (7.1)
by t to give

$$\frac{1}{2ct} \, \frac{f(c+1; z)}{f(c; z)} = \cfrac{1}{2ct + \cfrac{4t^2 z}{(2c+2)t} \, \cfrac{f(c+2; z)}{f(c+1; z)}} \,,$$

iterate and multiply by s to yield

$$\frac{s}{2ct} \, \frac{f(c+1; z)}{f(c; z)} = \frac{s}{2ct+}\; \frac{4t^2 z}{(2c+2)t+}\; \frac{4t^2 z}{(2c+4)t+} \cdots \frac{4t^2 z}{(2c+2k-2)t + \sigma_k} \,,$$

where

$$\sigma_k = \frac{4t^2 z}{(2c+2k)t} \, \frac{f(c+k+1; z)}{f(c+k; z)} \,.$$

Finally, take $c = \tfrac12$ and $z = s^2/4t^2$ to obtain

$$\tanh \frac{s}{t} = \frac{s}{t+}\; \frac{s^2}{3t+}\; \frac{s^2}{5t+}\; \frac{s^2}{7t+} \cdots . \qquad (7.3)$$

We repeat that to do this properly we have to prove that the right-hand side
of (7.3) does actually converge, and, moreover, that it converges to the value
claimed. By very similar calculations (just replace z by $-\tfrac14 z^2$ instead of $\tfrac14 z^2$),
and subject to the same cautions, we have

$$\tan z = \frac{z}{1-}\; \frac{z^2}{3-}\; \frac{z^2}{5-}\; \frac{z^2}{7-} \cdots \qquad\text{and}\qquad \tan \frac{s}{t} = \frac{s}{t-}\; \frac{s^2}{3t-}\; \frac{s^2}{5t-}\; \frac{s^2}{7t-} \cdots .$$


7.1. GENERALISED CONTINUED FRACTIONS 


Now we need to work out the theory of continued fractions having the form
of the first expression on page 177; we must pay particular attention to the
question of convergence of infinite continued fractions. We shall assume that
all $a_k$ and $b_k$ are positive real numbers (later we shall assume further that
they are in fact integers); it is then clear that any finite expression

$$\frac{b_1}{a_1+}\; \frac{b_2}{a_2+} \cdots \frac{b_k}{a_k}$$

is meaningful, and we define

$$\frac{b_1}{a_1+}\; \frac{b_2}{a_2+}\; \frac{b_3}{a_3+} \cdots = \lim_{k \to \infty} \frac{b_1}{a_1+}\; \frac{b_2}{a_2+} \cdots \frac{b_k}{a_k} \,,$$

provided that the limit exists. The first few truncations of the finite continued
fraction are

$$\frac{b_1}{a_1} \,, \qquad \frac{b_1}{a_1+}\; \frac{b_2}{a_2} = \frac{a_2 b_1}{a_2 a_1 + b_2} \,, \qquad \frac{b_1}{a_1+}\; \frac{b_2}{a_2+}\; \frac{b_3}{a_3} = \frac{a_3 a_2 b_1 + b_3 b_1}{a_3 a_2 a_1 + a_3 b_2 + b_3 a_1}$$

and so on, from which it is fairly easy to guess that we should define

$$p_k = a_k p_{k-1} + b_k p_{k-2} \qquad\text{and}\qquad q_k = a_k q_{k-1} + b_k q_{k-2}$$

with the initial conditions $p_{-1} = q_0 = 1$ and $p_0 = q_{-1} = 0$. Note that if every
$b_k$ is 1, then these formulae coincide with those for the convergents of a simple
continued fraction with $a_0 = 0$.


Now let x, like $a_k$ and $b_k$, be a positive real number. By methods essentially
identical to those of Chapter 4 we may show that

$$\frac{b_1}{a_1+} \cdots \frac{b_{k-1}}{a_{k-1}+}\; \frac{b_k}{x} = \frac{b_1}{a_1+} \cdots \frac{b_{k-1}}{a_{k-1} + b_k/x} = \frac{x p_{k-1} + b_k p_{k-2}}{x q_{k-1} + b_k q_{k-2}}$$

and hence that

$$\frac{b_1}{a_1+}\; \frac{b_2}{a_2+} \cdots \frac{b_k}{a_k} = \frac{p_k}{q_k} \,. \qquad (7.4)$$

A closely related result which will be of use in the future is

$$\frac{b_1}{a_1+} \cdots \frac{b_{k-1}}{a_{k-1}+}\; \frac{b_k}{a_k + x} = \frac{(a_k + x) p_{k-1} + b_k p_{k-2}}{(a_k + x) q_{k-1} + b_k q_{k-2}} = \frac{p_k + x p_{k-1}}{q_k + x q_{k-1}} \,.$$

It is also easy to prove by induction that

$$p_{k-1} q_k - p_k q_{k-1} = (-1)^k b_k b_{k-1} \cdots b_1 \,. \qquad (7.5)$$
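
The recurrences for the convergents and the identity (7.5) are straightforward to experiment with. The Python sketch below is an illustration only: the generator `convergents` is a hypothetical helper, and the choice s = t = 1 applies the formal expansion (7.3), whose convergence has not yet been justified at this point in the text.

```python
from fractions import Fraction
import math

def convergents(a, b):
    """Yield (p_k, q_k) for k = 1, 2, ... from lists a = [a_1, ...] and b = [b_1, ...]."""
    p_prev, p = 1, 0      # p_{-1}, p_0
    q_prev, q = 0, 1      # q_{-1}, q_0
    for ak, bk in zip(a, b):
        p_prev, p = p, ak * p + bk * p_prev
        q_prev, q = q, ak * q + bk * q_prev
        yield p, q

# Formal expansion (7.3) with s = t = 1: b_1 = s, b_k = s^2 for k >= 2, a_k = (2k - 1) t.
s, t, n = 1, 1, 8
a = [(2 * k - 1) * t for k in range(1, n + 1)]
b = [s] + [s * s] * (n - 1)
pq = list(convergents(a, b))

# Identity (7.5): p_{k-1} q_k - p_k q_{k-1} = (-1)^k b_k b_{k-1} ... b_1.
prod = b[0]
for k in range(2, n + 1):
    prod *= b[k - 1]
    (p1, q1), (p2, q2) = pq[k - 2], pq[k - 1]
    assert p1 * q2 - p2 * q1 == (-1) ** k * prod

approx = Fraction(*pq[-1])
print(float(approx), math.tanh(1))
```

Numerically the eighth convergent already agrees with tanh 1 to many decimal places, which is consistent with, though of course no proof of, the expansion (7.3).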


Since az, and by are positive, the convergent 


Pk _ GkPk—1 + bk PR—2 


dk Ak Gk—-1 + OR Gk-2 


lies strictly between pg—1/qe—1 and pr—2/qp—2; it is easy to check that the 
second convergent is less than the first, and so 


Sie ee, (7.6) 
q2 4 © 1 4 41 


Now suppose that a, and bz are positive integers, and that a, > by for suf- 
ficiently large k, say for all k > K. It is then clear that whenever k > 3, we 
have qx—1 > qx—2 and so gp > (ax + be)qe—2; Moreover, 


dk > (2bg)dk—2 > (2b,) (2bp—2)dk—4 sa (2b,) (2bp—2) EI 


where the last factor in the product is either 2bxK42q¢K or 2bK41qK-1. Hence 
for k > K, we have 


__ debp—a br byba—a ++i _ Kb sb 


Pk _ Pk-1 
dkUk—1 2'-K by be_-1---OK419K9K-1 2"qKaK-1 


qk dk-1 


since K is a fixed number, the right-hand side tends to zero as k + oo. 
Combining this with (7.6) and (7.4) proves the first part of the following 
result. 
Theorem 7.1. Irrationality of a generalised continued fraction. If $a_1, a_2, \ldots$
and $b_1, b_2, \ldots$ are positive integers with $a_k > b_k$ for all sufficiently large $k$,
then the infinite continued fraction

\[ \frac{b_1}{a_1+}\,\frac{b_2}{a_2+}\,\frac{b_3}{a_3+}\,\cdots \tag{7.7} \]

converges to an irrational limit.


Proof. We have shown that the required limit exists; it remains to prove this
limit irrational. So, suppose that the limit $\alpha$ is rational. For $k \ge 1$ write

\[ \alpha_k = \frac{b_k}{a_k+}\,\frac{b_{k+1}}{a_{k+1}+}\,\frac{b_{k+2}}{a_{k+2}+}\,\cdots \]

and note that if $\alpha_k$ is a rational number $p/q$, then

\[ \alpha_{k+1} = \frac{b_k}{\alpha_k} - a_k = \frac{b_k q - a_k p}{p} \tag{7.8} \]

is also rational. Since $\alpha_1 = \alpha$ is rational, so is every $\alpha_k$. But for sufficiently
large $k$ we have $a_k > b_k$, and the inequalities (7.6) yield

\[ \alpha_k < \frac{b_k}{a_k} < 1\,; \]

so in (7.8), we have $p < q$, and the denominators of the $\alpha_k$ eventually form
an infinite decreasing sequence of positive integers. This is impossible, and so
$\alpha$ cannot be rational.


Comment. The condition that $a_k > b_k$ for sufficiently large $k$ is by no means
necessary in order that the continued fraction converge. For example, Chrystal [18],
Part II, Chapter XXXIV, section 14 shows that the continued fraction (7.7)
converges if the series

\[ \sum_{k=2}^{\infty} \frac{a_{k-1} a_k}{b_k} \]


diverges. It follows from standard series convergence tests that the continued 
fraction converges if 


or if 


\[ \lim_{k\to\infty} \frac{a_{k+1} b_k}{a_{k-1} b_{k+1}} > 1\,. \]

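Chrystal also gives a necessary and sufficient convergence criterion: the continued
fraction (7.7) is convergent if and only if at least one of the series

\[ \frac{a_1}{b_1} + \frac{a_3 b_2}{b_3 b_1} + \frac{a_5 b_4 b_2}{b_5 b_3 b_1}
     + \frac{a_7 b_6 b_4 b_2}{b_7 b_5 b_3 b_1} + \cdots
   \qquad\text{and}\qquad
   \frac{a_2 b_1}{b_2} + \frac{a_4 b_3 b_1}{b_4 b_2}
     + \frac{a_6 b_5 b_3 b_1}{b_6 b_4 b_2} + \cdots \]
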
is divergent. Observe that (7.6) holds whenever all $a_k$ and $b_k$ are positive.
Therefore, it is impossible for a continued fraction to be unboundedly divergent;
it will fail to converge only if

\[ \lim_{k\to\infty} \frac{p_{2k}}{q_{2k}} < \lim_{k\to\infty} \frac{p_{2k+1}}{q_{2k+1}}\,. \]

In this case the continued fraction is said to oscillate. By Chrystal’s criterion 
above, a simple example of an oscillating continued fraction is 

\[ \frac{2}{1+}\,\frac{4}{1+}\,\frac{8}{1+}\,\frac{16}{1+}\,\cdots . \]

It does not seem easy to evaluate the two limit points of this continued fraction. 
However, there is a very simple result for a closely related expression: for the 
continued fraction 


1+ 14+ 1 1 1+ 
we have 
lim @£—1 and lim Met 9 ; 
k-oo 2k k—00 Q2k+1 


as shown in David Angell and Michael D. Hirschhorn [7]. 
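
The oscillation of the continued fraction with $a_k = 1$ and $b_k = 2^k$ can be observed numerically. The following sketch is ours, not the book's; it computes the convergents exactly and then prints floating-point values of the last even and last odd convergent. The cut-off at 40 terms is an arbitrary choice.

```python
from fractions import Fraction

def oscillating_convergents(n):
    """Convergents p_k/q_k (k = 1..n) of 2/(1 + 4/(1 + 8/(1 + ...))),
    that is, a_k = 1 and b_k = 2^k in the recurrences of this section."""
    p_prev, p = 1, 0   # p_{-1}, p_0
    q_prev, q = 0, 1   # q_{-1}, q_0
    out = []
    for k in range(1, n + 1):
        b_k = 2 ** k
        p_prev, p = p, p + b_k * p_prev
        q_prev, q = q, q + b_k * q_prev
        out.append(Fraction(p, q))
    return out

conv = oscillating_convergents(40)
print("p_40/q_40 =", float(conv[39]))  # even convergents increase
print("p_39/q_39 =", float(conv[38]))  # odd convergents decrease
```

The even and odd convergents settle down to two visibly different values, in agreement with Chrystal's criterion; the computation gives no closed form for either limit.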


7.1.1 Irrationality of tanh r 


To prove the irrationality of $\tanh r$ for non-zero rational $r$ we must consider
the question of convergence in the derivation of the continued fraction (7.3).
Taking, again, $c = \frac12$ and $z = s^2/4t^2$ in (7.2), we have for each $k$ an identity

\[ \tanh\frac{s}{t} = \frac{s}{t+}\,\frac{s^2}{3t+}\,\frac{s^2}{5t+}\,\cdots\,
   \frac{s^2}{(2k-1)t+\sigma_k}\,, \]

where

\[ \sigma_k = \frac{s^2\, f(k+\frac32;\ s^2/4t^2)}{(2k+1)t\, f(k+\frac12;\ s^2/4t^2)}\,. \]

It is clear from the defining series that $f(c;z)$ is positive whenever $c$ and $z$
are positive; therefore every $\sigma_k$ is positive, and the following theorem may be
applied.

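Theorem 7.2. Let $\alpha \in \mathbb{R}$, let $a_1, a_2, \ldots, b_1, b_2, \ldots \in \mathbb{R}^+$, and suppose that
there exist positive real numbers $\sigma_1, \sigma_2, \ldots$ such that

\[ \alpha = \frac{b_1}{a_1+}\,\frac{b_2}{a_2+}\,\cdots\,\frac{b_k}{a_k+\sigma_k} \tag{7.9} \]

for each $k \ge 1$. If the infinite continued fraction

\[ \frac{b_1}{a_1+}\,\frac{b_2}{a_2+}\,\frac{b_3}{a_3+}\,\cdots \]

converges, then its limit is $\alpha$.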
Proof. Suppose that the continued fraction converges to $\beta$; that is, the
convergents $p_k/q_k$ tend to $\beta$ as $k \to \infty$. Then from

\[ \alpha = \frac{b_1}{a_1+}\,\frac{b_2}{a_2+}\,\cdots\,\frac{b_k}{a_k+\sigma_k}
   = \frac{p_k + \sigma_k p_{k-1}}{q_k + \sigma_k q_{k-1}} \]

we see by using the result in appendix 4.1 (repeated at the end of the present
chapter) that for every $k$ the constant $\alpha$ lies between $p_k/q_k$ and $p_{k-1}/q_{k-1}$.
By the Sandwich Theorem $\alpha = \beta$, which is what we wanted to prove.


Corollary 7.3. If $r$ is rational and not zero, then $\tanh r$ is irrational.

Proof. Let $r = s/t$ be a non-zero rational; since the hyperbolic tangent is an
odd function, we may assume that $s > 0$. Clearly $(2k-1)t > s^2$ for sufficiently
large $k$, so by Theorem 7.1, the continued fraction

\[ \frac{s}{t+}\,\frac{s^2}{3t+}\,\frac{s^2}{5t+}\,\frac{s^2}{7t+}\,\cdots \]

converges to an irrational limit. But by the theorem just proved and the
remarks preceding it this limit is $\tanh r$; the result follows.
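
The rapid convergence of this continued fraction is easy to observe. The sketch below is ours, not the book's; the function name `tanh_convergent` is an arbitrary choice. It computes the $k$-th convergent exactly with $a_k = (2k-1)t$, $b_1 = s$ and $b_k = s^2$ for $k \ge 2$, and compares it with the library value of $\tanh(s/t)$.

```python
from fractions import Fraction
import math

def tanh_convergent(s, t, k):
    """k-th convergent of s/(t + s^2/(3t + s^2/(5t + ...)))."""
    p_prev, p = 1, 0   # p_{-1}, p_0
    q_prev, q = 0, 1   # q_{-1}, q_0
    for j in range(1, k + 1):
        a_j = (2 * j - 1) * t
        b_j = s if j == 1 else s * s
        p_prev, p = p, a_j * p + b_j * p_prev
        q_prev, q = q, a_j * q + b_j * q_prev
    return Fraction(p, q)

for k in range(1, 6):
    c = tanh_convergent(1, 2, k)
    print(k, c, float(c) - math.tanh(0.5))
```

For $r = 1/2$ the first three convergents are $1/2$, $6/13$ and $61/132$, and the error shrinks very quickly as $k$ grows.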


Corollary 7.4. If $r$ is a non-zero rational, then $e^r$ is irrational. In particular,
$e$ is irrational.

Proof. Use the previous corollary and the identity

\[ \tanh\frac{r}{2} = \frac{e^r - 1}{e^r + 1}\,. \]


Corollary 7.5. If $r$ is a positive rational with $r \ne 1$, then $\log r$ is irrational
(where, as usual, log denotes the natural logarithm).

Proof. Use the relation

\[ \tanh(\log r) = \frac{r^2 - 1}{r^2 + 1}\,. \]

Comment. Theorem 7.2 makes it appear that any choice of partial numerators
and denominators is possible in the generalised continued fraction of a
given real number. This is not quite true, as a bad choice will force $\sigma_k < 0$
for some $k$, which is not permissible. Nonetheless a wide variety of generalised
continued fractions exists converging to a given limit. If, however, the numerators
$b_k$ are given, then the $a_k$ are more or less uniquely determined, as the
following result shows.


Proposition 7.6. If $\alpha$ is irrational, if the $b_k$ are given integers, and if we
demand that the $a_k$ are integers such that for each $k$ a relation of the form (7.9)
holds with $0 < \sigma_k < 1$, then the $a_k$ are uniquely determined.

Proof. We have $\alpha = b_1/(a_1 + \sigma_1)$ with $0 < \sigma_1 < 1$; so

\[ a_1 < a_1 + \sigma_1 = \frac{b_1}{\alpha} < a_1 + 1 \]

and $a_1$ is the integer part, $\sigma_1$ the fractional part, of $b_1/\alpha$. Similarly, $a_{k+1}$ is
the integer part of $b_{k+1}/\sigma_k$ for each $k \ge 1$.
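
The proof is effectively an algorithm: repeatedly divide the next numerator by the current fractional part and split off the integer part. The following sketch is ours, not the book's (which uses Maple). It assumes that 60-digit decimal arithmetic is accurate enough for the few terms computed, and the choices $\alpha = \sqrt2$ and $b_k = k + 1$ are made purely for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # generous working precision; an arbitrary choice for this sketch

def partial_denominators(alpha, numerators):
    """Given alpha and b_1, b_2, ..., return the integers a_1, a_2, ...
    obtained by the procedure in the proof of Proposition 7.6."""
    denominators = []
    divisor = alpha                       # first alpha, afterwards sigma_k
    for b in numerators:
        quotient = Decimal(b) / divisor   # b_1/alpha, then b_{k+1}/sigma_k
        a = int(quotient)                 # integer part (the quotient is positive)
        denominators.append(a)
        divisor = quotient - a            # fractional part sigma_k
    return denominators

root2 = Decimal(2).sqrt()
print(partial_denominators(root2, range(2, 8)))  # → [1, 7, 16, 10, 19, 8]
```

Each division by a small $\sigma_k$ costs some accuracy, so the working precision must grow with the number of terms required; exact symbolic arithmetic would avoid this.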


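Example. With the assistance of the computer algebra system Maple we find

\[ \sqrt2 = \frac{2}{1+}\,\frac{3}{7+}\,\frac{4}{16+}\,\frac{5}{10+}\,\frac{6}{19+}\,
   \frac{7}{8+}\,\frac{8}{8+}\,\frac{9}{20+}\,\frac{10}{31+}\,\frac{11}{864+}\,\cdots \]
\[ \phantom{\sqrt2} = \frac{5}{3+}\,\frac{3}{5+}\,\frac{5}{8+}\,\frac{8}{26+}\,\frac{26}{647+}\,
   \frac{647}{712+}\,\frac{712}{1019+}\,\frac{1019}{3233+}\,\frac{3233}{21848+}\,\cdots \]
\[ \phantom{\sqrt2} = \frac{2}{1+}\,\frac{4}{9+}\,\frac{8}{12+}\,\frac{16}{89+}\,\frac{32}{125+}\,
   \frac{64}{111+}\,\frac{128}{142+}\,\frac{256}{758+}\,\frac{512}{541+}\,\cdots . \]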


7.2 FURTHER CONTINUED FRACTIONS 


To prove the irrationality of $\tan r$ for non-zero rational $r$ we study generalised
continued fractions having the form of the second expression on page 177.
As remarked earlier, it is difficult to ensure rigorously that such expressions
converge, even in the finite case, so we shall work directly with an appropriate
sequence of fractions $p_k/q_k$.

Given two infinite sequences $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$ of positive real
numbers, we define

\[ p_k = a_k p_{k-1} - b_k p_{k-2} \quad\text{and}\quad q_k = a_k q_{k-1} - b_k q_{k-2} \]

with initial conditions $p_0 = q_{-1} = 0$, $p_{-1} = -1$ and $q_0 = 1$. Informally

\[ \frac{p_k}{q_k} = \frac{b_1}{a_1-}\,\frac{b_2}{a_2-}\,\cdots\,\frac{b_k}{a_k}\,, \tag{7.10} \]

though we shall not actually seek to prove anything like this. Our aim is to
show that

• under suitable conditions on $a_k$ and $b_k$, the quotient $p_k/q_k$ tends to an
irrational limit as $k \to \infty$;

• certain specific values of $a_k$ and $b_k$ satisfy these conditions;

• for these specific values, the limiting value of $p_k/q_k$ is $\tan r$, where $r$ is
a given non-zero rational number.

We begin by making the following assumption, which will remain in force until
it is superseded by a weaker assumption in Theorem 7.17.
Assumption. For all $k$, the denominators $a_k$ and numerators $b_k$ are positive
integers with $a_k > b_k + 1$.

Comment. In fact, the assumption $a_k > b_k$ suffices to prove many properties
of the continued fractions (7.10); however, the above assumption is sufficient
for our purposes and simplifies our arguments.

First, we ensure that, subject to the above assumption, the denominator
$q_k$ never vanishes.

Lemma 7.7. For each $k \ge 0$, we have $q_k > q_{k-1} \ge 0$.

Proof. The case $k = 0$ is immediate from the definition; proceeding inductively
we have

\[ q_k = a_k q_{k-1} - b_k q_{k-2} > (b_k + 1) q_{k-1} - b_k q_{k-2}
       = q_{k-1} + b_k (q_{k-1} - q_{k-2}) > q_{k-1} \ge 0 \]

for $k \ge 1$.

Corollary 7.8. If $k \ge 0$, then $q_k$ is strictly positive; moreover, $q_k \to \infty$ as
$k \to \infty$.


The following lemma is easily proved.

Lemma 7.9. If $a_k$ and $b_k$ are positive real numbers, and $p_k$ and $q_k$ are defined
as above, then

\[ p_k q_{k-1} - p_{k-1} q_k = b_k b_{k-1} \cdots b_1 \]

for each $k \ge 1$.

Corollary 7.10. The fraction $p_k/q_k$ increases monotonically with $k$.

Lemma 7.11. Let $r_k = q_k - p_k$. Then $r_k \ge r_{k-1}$ for each $k \ge 0$.

Proof by induction. The basis of the induction is clear. If we take $k \ge 1$ and
assume that the result is true for $1, 2, \ldots, k-1$, then we have the inequalities
$r_{k-1} \ge r_{k-2} \ge \cdots \ge r_{-1} = 1$; hence

\[ r_k = a_k r_{k-1} - b_k r_{k-2} \ge (b_k + 1) r_{k-1} - b_k r_{k-1} = r_{k-1} \]

and the induction proceeds.

Corollary 7.12. If $k \ge 0$, then $p_k/q_k < 1$.

Corollary 7.13. As $k \to \infty$, the fraction $p_k/q_k$ tends to a limit $\alpha$ with
$0 < \alpha \le 1$.
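
The chain of results from Lemma 7.7 to Corollary 7.12 can be checked on sample data. The sketch below is ours, not the book's; the function names and the sample sequences $a_k = k + 2$, $b_k = k$ (which satisfy $a_k > b_k + 1$) are arbitrary choices.

```python
from fractions import Fraction
from math import prod

def minus_recurrence(a, b):
    """Return lists p, q where index i holds subscript i - 1, computed from
    p_k = a_k p_{k-1} - b_k p_{k-2} and q_k = a_k q_{k-1} - b_k q_{k-2}
    with p_{-1} = -1, p_0 = 0, q_{-1} = 0, q_0 = 1."""
    p = [-1, 0]
    q = [0, 1]
    for a_k, b_k in zip(a, b):
        p.append(a_k * p[-1] - b_k * p[-2])
        q.append(a_k * q[-1] - b_k * q[-2])
    return p, q

def check_section_7_2(a, b):
    """Check Lemma 7.7, Lemma 7.9, Corollary 7.10, Lemma 7.11 and Corollary 7.12."""
    p, q = minus_recurrence(a, b)
    n = len(a)
    lemma_7_7 = all(q[i + 1] > q[i] >= 0 for i in range(n + 1))
    lemma_7_9 = all(
        p[k + 1] * q[k] - p[k] * q[k + 1] == prod(b[:k]) for k in range(1, n + 1)
    )
    ratios = [Fraction(p[k + 1], q[k + 1]) for k in range(n + 1)]  # p_0/q_0, ...
    cor_7_10 = all(x < y for x, y in zip(ratios, ratios[1:]))
    r = [q[i] - p[i] for i in range(n + 2)]                        # r_{-1}, r_0, ...
    lemma_7_11 = all(r[i + 1] >= r[i] for i in range(n + 1))
    cor_7_12 = all(x < 1 for x in ratios)
    return lemma_7_7 and lemma_7_9 and cor_7_10 and lemma_7_11 and cor_7_12

a = [k + 2 for k in range(1, 11)]   # a_k = k + 2
b = [k for k in range(1, 11)]       # b_k = k
print(check_section_7_2(a, b))
```

Such a check proves nothing, of course, but it is a useful guard against sign errors in the recurrences and initial conditions.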


Next we wish to show that the limit $\alpha$ is irrational. In effect, we shall do
this by writing

\[ \alpha_m = \frac{b_m}{a_m-}\,\frac{b_{m+1}}{a_{m+1}-}\,\frac{b_{m+2}}{a_{m+2}-}\,\cdots \]

and mimicking the proof of Theorem 7.1. However, in order to avoid any
convergence problems we once again refrain from employing this sort of continued
fraction, and instead work directly with $p_k/q_k$ and related fractions. We
denote by $p_{m,k}$ and $q_{m,k}$ the numbers satisfying the same initial conditions
and recurrences as $p_k$ and $q_k$, except that the partial numerators $b_1, b_2, \ldots$
and denominators $a_1, a_2, \ldots$ are replaced by $b_m, b_{m+1}, \ldots$ and $a_m, a_{m+1}, \ldots$
respectively. That is,

\[ q_{m,-1} = 0, \quad q_{m,0} = 1, \quad q_{m,1} = a_m, \quad q_{m,2} = a_{m+1} a_m - b_{m+1} \]

and so on. Specifically, the recurrence for $p_{m,k}$ is

\[ p_{m,k} = a_{m+k-1}\, p_{m,k-1} - b_{m+k-1}\, p_{m,k-2}\,, \]

and that for $q_{m,k}$ is similar. Clearly $p_{1,k} = p_k$ and $q_{1,k} = q_k$ for all $k$. Since
the partial numerators and denominators defining $p_{m,k}$ and $q_{m,k}$ are positive
integers satisfying $a_k > b_k + 1$, the properties proved above for $p_k$ and $q_k$ also
hold for $p_{m,k}$ and $q_{m,k}$. In particular, the quotient $p_{m,k}/q_{m,k}$ tends to a limit
$\alpha_m$ as $k \to \infty$, and this limit satisfies $0 < \alpha_m \le 1$.


The proof of the following lemma is an easy induction on $k$.

Lemma 7.14. For all integers $m \ge 1$ and $k \ge -1$, we have

\[ p_{m,k+1} = b_m q_{m+1,k} \quad\text{and}\quad q_{m,k+1} = a_m q_{m+1,k} - p_{m+1,k}\,. \]

Corollary 7.15. For each $m \ge 1$, we have

\[ \alpha_m = \frac{b_m}{a_m - \alpha_{m+1}}\,. \]

Proof. By the arguments given for $p_k$ and $q_k$, all $p_{m,k}$ and $q_{m,k}$ are positive
for $k \ge 1$. From the lemma, we have

\[ \frac{p_{m,k+1}}{q_{m,k+1}} \left( a_m - \frac{p_{m+1,k}}{q_{m+1,k}} \right)
   = \frac{b_m q_{m+1,k}}{a_m q_{m+1,k} - p_{m+1,k}} \cdot
     \frac{a_m q_{m+1,k} - p_{m+1,k}}{q_{m+1,k}} = b_m\,; \]

taking the limit as $k \to \infty$ of each side yields $\alpha_m (a_m - \alpha_{m+1}) = b_m$, and since
$b_m \ne 0$ the desired equality follows.
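
Lemma 7.14 relates the sequences starting at index $m$ to those starting at index $m + 1$, and it can be spot-checked mechanically. The sketch below is ours, not the book's; the helper names are arbitrary, and the lists `a` and `b` store $a_j$ and $b_j$ at position $j - 1$.

```python
def shifted_pq(a, b, m, k):
    """Return (p_{m,k}, q_{m,k}) for k >= -1, where a[j-1], b[j-1] hold a_j, b_j."""
    p_prev, p = -1, 0   # p_{m,-1}, p_{m,0}
    q_prev, q = 0, 1    # q_{m,-1}, q_{m,0}
    if k == -1:
        return p_prev, q_prev
    for j in range(m, m + k):   # uses a_m, ..., a_{m+k-1}
        p_prev, p = p, a[j - 1] * p - b[j - 1] * p_prev
        q_prev, q = q, a[j - 1] * q - b[j - 1] * q_prev
    return p, q

def lemma_7_14_holds(a, b, m, k):
    """Check p_{m,k+1} = b_m q_{m+1,k} and q_{m,k+1} = a_m q_{m+1,k} - p_{m+1,k}."""
    p_next, q_next = shifted_pq(a, b, m, k + 1)
    p_shift, q_shift = shifted_pq(a, b, m + 1, k)
    return (p_next == b[m - 1] * q_shift
            and q_next == a[m - 1] * q_shift - p_shift)

a = [k + 2 for k in range(1, 13)]   # sample data with a_k > b_k + 1
b = [k for k in range(1, 13)]
print(all(lemma_7_14_holds(a, b, m, k) for m in (1, 2, 3) for k in range(-1, 7)))
```

Dividing the two identities of the lemma gives $p_{m,k+1}/q_{m,k+1} = b_m/(a_m - p_{m+1,k}/q_{m+1,k})$ exactly, which is the finite form of Corollary 7.15.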


Theorem 7.16. The limit $\alpha$ of $p_k/q_k$ is irrational.

Proof. First, observe that since $\alpha_{m+1} \le 1$ and $a_m > b_m + 1$, we have

\[ \alpha_m = \frac{b_m}{a_m - \alpha_{m+1}} \le \frac{b_m}{a_m - 1} < 1\,, \]

and so in fact each $\alpha_m$ is strictly less than 1. For any $m$, if $\alpha_m$ is a rational
number $p/q$, then

\[ \alpha_{m+1} = a_m - \frac{b_m}{\alpha_m} = \frac{p a_m - q b_m}{p}\,, \]

which has smaller denominator than $\alpha_m$. If $\alpha_1 = \alpha$ is rational, therefore, the
denominators of the $\alpha_m$ constitute an infinite decreasing sequence of positive
integers. This is impossible, and so $\alpha$ must be irrational.


We now have to prove the existence and irrationality of $\alpha$ without assuming
that $a_k > b_k + 1$ for all $k$.

Theorem 7.17. Suppose that $a_k$ and $b_k$ are positive integers for all $k \ge 1$,
and that $a_k > b_k + 1$ for all sufficiently large $k$. Then

\[ \alpha = \lim_{k\to\infty} \frac{p_k}{q_k} \]

exists and is an irrational number.


Proof. Suppose that the inequality relating $a_k$ and $b_k$ holds for all $k \ge K$.
Since the numerators $b_K, b_{K+1}, \ldots$ and denominators $a_K, a_{K+1}, \ldots$ all satisfy
this inequality, our previous results show that $p_{K,k}/q_{K,k}$ tends to an irrational
limit $\alpha_K$ as $k \to \infty$. By induction on $k$ we can show that

\[ p_{m,k} = -p_{m,\ell}\, p_{m+\ell+1,k-\ell-1} + p_{m,\ell+1}\, q_{m+\ell+1,k-\ell-1}\,, \tag{7.11} \]
\[ q_{m,k} = -q_{m,\ell}\, p_{m+\ell+1,k-\ell-1} + q_{m,\ell+1}\, q_{m+\ell+1,k-\ell-1} \]

for $m \ge 1$ and $0 \le \ell \le k$; the induction is facilitated by the observation
that the relations are trivial for $\ell = k$ and for $\ell = k - 1$. Taking $m = 1$ and
$\ell = K - 2$, we have

\[ p_k = -p_{K-2}\, p_{K,k-K+1} + p_{K-1}\, q_{K,k-K+1} \]

for $k \ge K - 1$; since $q_{K,k-K+1}$ is not zero,

\[ \frac{p_k}{q_{K,k-K+1}} = -p_{K-2}\, \frac{p_{K,k-K+1}}{q_{K,k-K+1}} + p_{K-1}\,. \]

Treating the second equation of (7.11) in the same way, and then letting $k$
increase without bound, we find that

\[ \frac{p_k}{q_{K,k-K+1}} \to p_{K-1} - \alpha_K p_{K-2} \quad\text{and}\quad
   \frac{q_k}{q_{K,k-K+1}} \to q_{K-1} - \alpha_K q_{K-2} \]

as $k \to \infty$.

Now $q_{K-1} - \alpha_K q_{K-2}$ is not zero. If it were, then the irrationality of $\alpha_K$
would give $q_{K-1} = q_{K-2} = 0$, hence $q_{K-3} = 0$, and eventually $q_0 = 0$, which
is false. Consequently $q_k$ is non-zero for sufficiently large $k$, for if $q_k$ were zero
for infinitely many $k$ then $q_k/q_{K,k-K+1}$ would have a limit of zero, or no limit
at all. Therefore, for sufficiently large $k$ we have

\[ \frac{p_k}{q_k} = \frac{p_k/q_{K,k-K+1}}{q_k/q_{K,k-K+1}} \to
   \frac{p_{K-1} - \alpha_K p_{K-2}}{q_{K-1} - \alpha_K q_{K-2}}\,, \]

and this is the desired limit $\alpha$. To show that $\alpha$ is irrational, suppose otherwise;
then the relation

\[ \alpha_K (p_{K-2} - \alpha q_{K-2}) = p_{K-1} - \alpha q_{K-1}\,, \]

with the irrationality of $\alpha_K$, implies that $p_{K-1} - \alpha q_{K-1} = 0 = p_{K-2} - \alpha q_{K-2}$.
But this implies

\[ 0 = (p_{K-1} - \alpha q_{K-1})\, q_{K-2} - (p_{K-2} - \alpha q_{K-2})\, q_{K-1}
     = p_{K-1} q_{K-2} - p_{K-2} q_{K-1}
     = b_{K-1} b_{K-2} \cdots b_1\,, \]

which is impossible since each $b_k$ is non-zero. We conclude that $\alpha$ is irrational,
and the proof is complete.


7.2.1 Irrationality of tanr 


Now let $r = s/t$ be a non-zero rational number; since the tangent function
is odd, we may assume that $s$ and $t$ are positive integers. From our informal
derivation of the continued fraction for $\tan r$ on page 179 it is apparent that
we should define partial numerators and denominators by

\[ a_k = (2k-1)t \ \text{ for } k \ge 1 \quad\text{and}\quad
   b_k = \begin{cases} s & \text{for } k = 1 \\ s^2 & \text{for } k \ge 2. \end{cases} \]

Consider the three aims stated on page 184. It is clear that $a_k > b_k + 1$
for sufficiently large $k$, and we have shown that in this case $p_k/q_k$ tends to
an irrational limit as $k \to \infty$. It remains to prove that this limit is $\tan r$.
Rephrasing the definition of $p_k$ and $q_k$, we have

\[ p_0 = 0, \quad p_1 = s, \quad p_{k+1} = (2k+1)t\, p_k - s^2 p_{k-1}\,, \]
\[ q_0 = 1, \quad q_1 = t, \quad q_{k+1} = (2k+1)t\, q_k - s^2 q_{k-1} \]

for $k \ge 1$.
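
These recurrences are easily run by machine. The following sketch is ours, not the book's; the function name `tan_pq` is an arbitrary choice. It generates $p_k$ and $q_k$ as exact integers and compares $p_k/q_k$ with the library value of $\tan(s/t)$.

```python
import math

def tan_pq(s, t, n):
    """Return lists p, q with p[k] = p_k and q[k] = q_k for k = 0..n (n >= 1), from
    p_0 = 0, p_1 = s, q_0 = 1, q_1 = t and
    x_{k+1} = (2k+1) t x_k - s^2 x_{k-1} for x = p and x = q."""
    p = [0, s]
    q = [1, t]
    for k in range(1, n):
        p.append((2 * k + 1) * t * p[k] - s * s * p[k - 1])
        q.append((2 * k + 1) * t * q[k] - s * s * q[k - 1])
    return p, q

p, q = tan_pq(1, 1, 8)
for k in range(1, 9):
    print(k, p[k], q[k], p[k] / q[k] - math.tan(1.0))
```

With $s = t = 1$ the denominators begin $1, 1, 2, 9, 61, 540$, and the quotient $p_k/q_k$ approaches $\tan 1$ very rapidly. With $s = 2$ and $t = 1$ the denominators begin $1, 1, -1, -9$, so positivity of $q_k$ cannot be taken for granted when $a_k > b_k + 1$ fails for small $k$.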


In order to show that $\tan r$ is the limit of $p_k/q_k$ as $k \to \infty$, we write the
difference as a fraction,

\[ \tan r - \frac{p_k}{q_k} = \frac{q_k \sin r - p_k \cos r}{q_k \cos r}\,, \]

and begin by looking at the numerator.

Lemma 7.18. If $k$ is a non-negative integer, then

\[ q_k \sin r - p_k \cos r = \sum_{m=0}^{\infty} (-1)^m\,
   \frac{2^k\,(m+k)!}{m!\,(2m+2k+1)!}\, \frac{s^{2m+2k+1}}{t^{2m+k+1}}\,. \]

Moreover, $q_k \sin r - p_k \cos r \to 0$ as $k \to \infty$.
Proof. The series formula is proved by induction on $k$. It is not too hard to
check that the result is true for $k = 0$ and for $k = 1$; suppose it is true for
some integer $k \ge 1$, and also for $k - 1$. Then we have

\[ q_{k+1} \sin r - p_{k+1} \cos r
   = (2k+1)t\,(q_k \sin r - p_k \cos r) - s^2 (q_{k-1} \sin r - p_{k-1} \cos r) \]
\[ = \sum_{m=0}^{\infty} (-1)^m (2k+1)\,
     \frac{2^k\,(m+k)!}{m!\,(2m+2k+1)!}\, \frac{s^{2m+2k+1}}{t^{2m+k}}
   - \sum_{m=0}^{\infty} (-1)^m\,
     \frac{2^{k-1}\,(m+k-1)!}{m!\,(2m+2k-1)!}\, \frac{s^{2m+2k+1}}{t^{2m+k}}\,. \]

Writing this expression as a single sum yields

\[ \sum_{m=0}^{\infty} (-1)^m\, \frac{2^{k-1}\,(m+k)!}{m!\,(2m+2k+1)!}
   \left[ 2(2k+1) - \frac{(2m+2k+1)(2m+2k)}{m+k} \right]
   \frac{s^{2m+2k+1}}{t^{2m+k}}\,, \]

and upon simplification the factor in square brackets is just $-4m$. Therefore,
we may drop the $m = 0$ term and then shift the summation index to give

\[ q_{k+1} \sin r - p_{k+1} \cos r
   = \sum_{m=1}^{\infty} (-1)^{m-1}\,
     \frac{2^{k+1}\,(m+k)!}{(m-1)!\,(2m+2k+1)!}\, \frac{s^{2m+2k+1}}{t^{2m+k}}
   = \sum_{m=0}^{\infty} (-1)^m\,
     \frac{2^{k+1}\,(m+k+1)!}{m!\,(2m+2k+3)!}\, \frac{s^{2m+2k+3}}{t^{2m+k+2}}\,. \]

By induction, the identity is true for all $k \ge 0$. Now the ratio of successive
terms in the series has absolute value

\[ \frac{(m+k+1)!}{(m+1)!\,(2m+2k+3)!}\, \frac{s^{2m+2k+3}}{t^{2m+k+3}} \Big/
   \frac{(m+k)!}{m!\,(2m+2k+1)!}\, \frac{s^{2m+2k+1}}{t^{2m+k+1}}
   = \frac{s^2}{2(m+1)(2m+2k+3)\,t^2} < \frac{s^2}{4kt^2}\,. \]

Thus for large $k$ (specifically, for $k > s^2/4t^2$), we have an alternating series
in which the terms, right from the very first, are monotonically decreasing in
absolute value. Therefore

\[ 0 < q_k \sin r - p_k \cos r < \{\text{first term}\}
   = \frac{2^k\, k!}{(2k+1)!}\, \frac{s^{2k+1}}{t^{k+1}}
   < \frac{s}{t}\, \frac{(s^2/t)^k}{k!}\,. \]

The right-hand side tends to zero as $k \to \infty$, and so $q_k \sin r - p_k \cos r \to 0$,
as claimed.
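
The series formula of Lemma 7.18 can be compared numerically with the direct value of $q_k \sin r - p_k \cos r$. The sketch below is ours, not the book's; the function names and the truncation after 30 terms are arbitrary choices, and the direct value is computed in floating point, so agreement is only to rounding accuracy.

```python
import math
from fractions import Fraction

def tail_series(s, t, k, terms=30):
    """Partial sum of the series in Lemma 7.18 (exact arithmetic, then a float)."""
    total = Fraction(0)
    for m in range(terms):
        coeff = Fraction(2 ** k * math.factorial(m + k),
                         math.factorial(m) * math.factorial(2 * m + 2 * k + 1))
        total += (-1) ** m * coeff * Fraction(s ** (2 * m + 2 * k + 1),
                                              t ** (2 * m + k + 1))
    return float(total)

def direct_value(s, t, k):
    """q_k sin r - p_k cos r with r = s/t, using the recurrences for p_k and q_k."""
    p = [0, s]
    q = [1, t]
    for j in range(1, k):
        p.append((2 * j + 1) * t * p[j] - s * s * p[j - 1])
        q.append((2 * j + 1) * t * q[j] - s * s * q[j - 1])
    r = s / t
    return q[k] * math.sin(r) - p[k] * math.cos(r)

for k in range(6):
    print(k, tail_series(1, 1, k), direct_value(1, 1, k))
```

For larger $k$ the direct value suffers from cancellation, since it is the difference of two large nearly equal quantities, whereas the series remains accurate; this is one practical reason for preferring the series when estimating the numerator.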


We also need to know what happens to $q_k$ for large $k$. We showed in
Corollary 7.8 that $q_k \to \infty$ as $k \to \infty$; but this was proved only under the
assumption that $a_k > b_k + 1$ for all $k$, which need not be the case here. Indeed,
if we take $s = 2$ and $t = 1$ it is easy to calculate $q_2 = -1$ and $q_3 = -9$; for
$k \ge 3$ we have $(2k+1)t > s^2 > 0$, so

\[ q_{k+1} = (2k+1)t\, q_k - s^2 q_{k-1} < q_k \]

and $q_k \to -\infty$ as $k \to \infty$.

Lemma 7.19. If $a_k > b_k + 1$ for all sufficiently large $k$, then either $q_k \to \infty$
or $q_k \to -\infty$ as $k \to \infty$.

Proof. With the notation $q_{m,k}$ introduced on page 186, we can use (7.11) as
in the proof of Theorem 7.17 to show that

\[ q_k = q_{K,k-K+1} \left( q_{K-1} - \frac{p_{K,k-K+1}}{q_{K,k-K+1}}\, q_{K-2} \right) \]

for $k \ge K - 1$. Continuing with the ideas of this proof, we see that as $k \to \infty$
the bracketed factor tends to a non-zero limit; moreover, $q_{K,k-K+1} \to \infty$
since $a_k > b_k + 1$ for all $k \ge K$. Therefore, $q_k$ tends to $\infty$ or $-\infty$, depending
on whether the bracketed quantity tends to a positive or negative limit.


We can now complete the irrationality proof for $\tan r$. Assume that $r$ is
not an odd multiple of $\frac12\pi$. (Of course we “know” that this is true since $\pi$ is
irrational, but part of our current aim is to give an alternative proof of that
same fact!) Then $\cos r \ne 0$, and by the two lemmas we have just proved,

\[ \tan r - \frac{p_k}{q_k} = \frac{q_k \sin r - p_k \cos r}{q_k \cos r} \to 0 \tag{7.12} \]

as $k \to \infty$; that is,

\[ \lim_{k\to\infty} \frac{p_k}{q_k} = \tan r\,. \]

Corollary 7.20. If $r$ is rational, not zero, and not an odd multiple of $\frac12\pi$,
then $\tan r$ is irrational.

Comment. We mention parenthetically that (7.12) enables us very easily
to settle the question of whether $q_k$ tends to $\infty$ or to $-\infty$. The expression
$\tan r - p_k/q_k$ is always decreasing (Corollary 7.10) and tends to zero, so it is
always positive; and we know that $q_k \sin r - p_k \cos r$ is positive; therefore $q_k$
tends to $\infty$ if $\cos r$ is positive, and to $-\infty$ if $\cos r$ is negative.

Finally we observe that we can now prove $\pi$ to be irrational, and as a
consequence can remove the extraneous condition in the above corollary.

Corollary 7.21. $\pi$ is irrational.

Proof. Suppose that $\pi$ is rational. Then $\frac14\pi$ is rational; it certainly is neither
zero nor an odd integer times $\frac12\pi$, and so by the previous corollary $1 = \tan\frac14\pi$
is irrational. But this is false.

Theorem 7.22. If $r$ is a non-zero rational number, then $\tan r$ is irrational.
EXERCISES

7.1 Given positive integers $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$, define $p_k$ and $q_k$ by the
recurrences and initial conditions given in section 7.1:

\[ p_k = a_k p_{k-1} + b_k p_{k-2} \quad\text{and}\quad q_k = a_k q_{k-1} + b_k q_{k-2} \]

with $p_{-1} = q_0 = 1$ and $q_{-1} = p_0 = 0$. Show how $p_k$ and $q_k$ may be
evaluated in terms of a matrix product similar to that in exercise 4.5.

7.2 Prove that

\[ \frac{2}{1+}\,\frac{3}{2+}\,\frac{4}{3+}\,\frac{5}{4+}\,\frac{6}{5+}\,\cdots = 1 \]

by showing that its convergents $p_k/q_k$ satisfy


and $p_k = q_k - (-1)^k$.

7.3 Show that


a result originally due to Euler.

7.4 Prove that

\[ \log 2 = \frac{1}{1+}\,\frac{1}{1+}\,\frac{4}{1+}\,\frac{9}{1+}\,\cdots , \]

where log denotes the natural logarithm; and that

\[ \tan^{-1} x = \frac{x}{1+}\,\frac{x^2}{3-x^2+}\,\frac{9x^2}{5-3x^2+}\,
   \frac{25x^2}{7-5x^2+}\,\cdots \]

for $|x| < 1$. Use the latter to obtain a continued fraction for $\pi$.

7.5 Define $p_k$ and $q_k$ as in section 7.2,

\[ p_k = a_k p_{k-1} - b_k p_{k-2} \quad\text{and}\quad q_k = a_k q_{k-1} - b_k q_{k-2} \]

with $p_{-1} = q_0 = 1$ and $q_{-1} = p_0 = 0$; assume that $a_k$ and $b_k$ are positive
integers with $a_k > b_k + 1$ for all $k$.

(a) Prove that if $k \ge 2$, then $q_k > (b_k + b_{k-1}) q_{k-2}$.

(b) Show that the inequality $q_k > 2 b_k q_{k-2}$ is not generally true.

7.6 Prove that if $r^2$ is rational and not zero, then $r \tan r$ is irrational. Deduce
that $\pi^2$ is irrational.
7.7 For $k \ge 1$, let


_ 8k? +1 /(2k-3)(2k—7)---\? 
oe OK (Sa). 


where the dots indicate the product of a decreasing arithmetic progression
with difference 4, continuing as long as the factors are positive. An
empty product is taken to be 1, so that we have $a_1 = 3$. Show that


es ee ee, _t 
ay— ag—- a3— pane : 


APPENDIX: SOME RESULTS FROM ELEMENTARY ALGEBRA 
AND CALCULUS 


A simple property of positive fractions. If $a$, $b$, $c$ and $d$ are positive and
$a/b$ is not equal to $c/d$, then

\[ \frac{a+c}{b+d} \]

lies (strictly) between $a/b$ and $c/d$.

The following result is known variously as the Sandwich Theorem, the
Squeeze Theorem, or the Pinching Theorem.

Theorem 7.23. Let $\{u_k\}$, $\{v_k\}$ and $\{w_k\}$ be sequences of real numbers. If

\[ u_k \le v_k \le w_k \ \text{ for all } k \quad\text{and}\quad
   \lim_{k\to\infty} u_k = \lim_{k\to\infty} w_k = L\,, \]

then the limit as $k \to \infty$ of $v_k$ exists and is equal to $L$.

Corollary 7.24. If $u_k \to L$ as $k \to \infty$, and if for each $k$ the constant $\alpha$ lies
between $u_{k-1}$ and $u_k$, then $L = \alpha$.


May not music be described as the mathematics of the sense, 
mathematics as music of the reason? 

The musician feels mathematics, the mathematician thinks music: 
music the dream, mathematics the working life. 


J.J. Sylvester 


Hints for exercises 


CHAPTER 1 


1.1 
1.3 


1.4 
1.5 


1.6 


Le 


1.8 
1.11 


1.12 
1.13 


Show that $(2q - p)/(p - q) = p/q$ and note that $0 < p - q < q$.


Assuming that 6, is an integer, show that q’ = ¢B8x41—q|Px41| and qa 
are both integers and that 0< qd <q. 


(a) Consider (a — b)/(./a — Vb). 


If r = p/q, where p,q are coprime, then px + gy = 1 for some integers 
z,y. Use this to show that (p/q)!/” is rational and deduce that p = m? 
for some positive integer m. When is this possible? 


We may assume that $a, b, c$ are integers having no common factor. Now
multiplying by $\sqrt[3]{2}$ we get $2c + a\sqrt[3]{2} + b\sqrt[3]{4} = 0$, and then another similar
equation. This gives a homogeneous system of three linear equations in
three variables having a non-trivial solution; so the determinant of the
system must be zero. Working it out,

\[ a^3 + 2b^3 + 4c^3 - 6abc = 0\,, \]

and following the argument for the irrationality of $\sqrt{2}$ yields a contradiction.
There is a much easier solution using methods from Chapter 3.


Start with the fact that $2q^2 - p^2$ is a non-zero integer, so $|2q^2 - p^2| \ge 1$.
This problem looks forward to topics we shall discuss in Chapters 3
and 4.


If x is rational, then (3 — b)a and (2 — a)/z are also rational. 


Use a formula for sin 30 to find a polynomial p(z) with integer coefficients 
having a as a root. Don’t do any more work than you have to in ruling 
out possible rational values of a! Better still, do something similar for 
2a instead of a. 


(c) Use a well-known relation connecting cos? rm and cos 2rz. 


If the sequence of digits is eventually periodic with period p, then there 
is a power of 2 having digits dj dz ---dpdidz---dp---didz--+dp. But such 
a number cannot, in fact, be a power of 2. Various other solutions are 
possible. 
1.14 The solution to exercise 1.13 relies implicitly on the fact that there is


1.18 


ee 


a power of 2 with any specified number of digits. This is not true for 
powers of 13, so a more subtle argument will be required. 


Write the (eventually periodic) decimal of a as 
a = 0.d,dg--+dmey-+ ++ €pe1+**ep*** 


Explain why if n > m, then an, has more digits than a,, and deduce 
that Gn42p > 10an. Choose 


2 =10/2? and 0<e<min{ E[k=1,2,...,m+2p}. 
i 


(a) Denote the sum of the digits of a positive integer k by o(k). Then, 
most of the time, o(k + 1) is equal to o(k) + 1: when is it not? 


(b) Suppose that a = 0.d,d2--- is rational and that its decimal has 
period p. Choose a multiple g of p such that o(q) is not a multiple of 10, 
and choose a such that 10‘ = a (mod p) for infinitely many t. For any 
sufficiently large such t, we have 


o(10°)=1 (mod 10), o(10'+¢)41 (mod 10), 


which means that the sequence { dmp+a}m>o contains each of two dif- 
ferent values infinitely often. But this is impossible. 


Using the Maclaurin series for cos a gives (2n)!cos1 = N + R, where N 
is an integer and 


1 


(2n + 1)(2n + 2) 


Suppose that b = ae + ce~! with a,b,c € Z; we may assume that a > 0 
and c #0. Ifc > 0 and n is odd, we have 


bn! — a(n! +--+ +1) —c(n! —----—1) = arn + cs, 
with 

n! n! n! n! 
Git!” eo” | OP Ge! el 


this gives 0 < arn + C5, < (a+c)/n. 


Suppose that a = p/qg. Explain why there exists n such that gig2---9n 
is a multiple of gq; then write gig2--- gna as an integer plus a remainder, 
and use the standard type of argument. 
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1.22 Use the exponential series to write A = (2n)!a/2"T! as an integer plus 
a remainder. Then find n such that A is also an integer. This idea is due 
to D.W. Masser. 


1.23 All dihedral angles of a regular tetrahedron, and of a cube, are the 
same: cos + z for the tetrahedron, 5 for the cube. Problem 1.12 tells us 
something useful about rational values of cos x. 


1.24 Suppose that x > y > 0. If 7 € Q and y? € Q and z/y ¢ Q, then 
(z—-y)* €Q 
Euclid also has the assumption x«?/y? € Q, but this clearly follows from 
the first two hypotheses. 


CHAPTER 2 


2.1 Use integration by parts twice to show that 


2 ee ee 


In, 
nT T : 


with initial conditions Jy = 0 and I, = 4/7”. Supposing that 7? = a/b, 
define J, = a"I,,/n!; prove that J, is always an integer, that J, is 
non~zero for infinitely many n, and that J, tends to zero as n + oo. 


2.2 We need to prove that if 7 = a/b, then 


n/2 
i (a? — b?x)” cos x dx| > 
0 


/ (a? — b?x?)" cosa dz] . 
m/2 


Show that this is a consequence of the fact that 0 < « < m—ax whenever 
<2 < 4: 


2.3 Suppose that 7,/c = a/b and take 
f(a) =a"(a—byex)” . 
As in the proof of Theorem 2.4 we obtain 
I=F(r)+F(0) with F(x) = f(x)—-f"(a)+---. 


Noting that f(x) = f(a — x) gives the useful simplification I = 2F (0). 
Taking a bit of care (since f does not have integral coefficients) we can 
show that if n,k are even and k > n, then f)(0) is an integer multiple 
of (n+ 1)!; this leads in the usual way to a contradiction for sufficiently 
large even n. The irrationality of 7? follows on noting that if 7? = p/q, 


then 7,/pq = p. 


2.4 Apply the ideas in the proof of Theorem 2.5 to | f(x) sinh a da. 
0 
2.5


2.6 


2.7 


First, prove that 


where 
F(a) = fle) — 4 F"(@) + FO —-- + IOM@). 


Supposing r? = a/b, show that a?” F(1) is an integer divisible by (n+1)! 
and that a?” F'(0) is a non-zero multiple of n!. 


a) Induction. The initial conditions are up = vy = 0, wi = vp = 1. 


c) Use the integral formula to estimate |Jyin+2(r)|. The estimate in- 


( 

(b) First, prove that b"d”u, and b"d"v,, are integers. 
( 

volves a gamma function term, which can be related to a factorial. 
(d) Don’t forget to show that infinitely many J,.,(r) are non-zero. 


(e) Simplifying 
T(k +5) = (k- 5)(k— 5) ++ (S)(S)P(5) 


shows that the series for J,/2(x) and J_j/2(x) are closely related to 
well-known Maclaurin expansions. 


(a) The given series converges for all z, so f is an entire function and has 
a Taylor series centred at any zo € C. The integral formula is a standard 
result of Taylor series. 


(b) Use the given series to confirm the differential equation; substitute 
the other series into it to find 


A(n + 2)(n + Lente + 2(2n + 1)(n + Leng — r?en = 0 
for n > 0, with 


rsinhr 


2 


co=f(1)=coshr and c= f'(1)= 


(c) Find a recurrence for d,,. If even two consecutive d,, are zero, then 
using the recurrence “backwards” shows that all d,, are zero, which is 
not the case. 

(d) The given series shows immediately that |f(z)| < cosh(r|z|!/2). 


(e) If z is on the circle of integration, then |z| < 1+n? < 4n?; estimating 
the integral gives |cn| < e?""/n?”. 
CHAPTER 3


3.1 


3.2 


3.3 


3.4 


3.5 


The minimal polynomial is 


f(z) = 2° — G24 — 62° + 1227 —362+1. 


To prove that this polynomial is irreducible we can reduce modulo 3 to 
get f3(z) = z©+1 = (2?+1); this shows that f(z) has no linear or cubic 
factors. It remains to eliminate the possibility that f(z) is the product 
of a quadratic and a quartic. 


Use the triple-angle formula cos3@ = 4cos? 6 — 3cos@ to show that a 


satisfies f(z) = 823 — 6z —1. The value of cos3@ is unchanged if @ is 


replaced by 6+ Ska; so the conjugates of a are cos %, Cos — and cos Bn 


Remember that 1+ ¢+¢?4+¢3+¢4 =0 and ¢° = 1. We have 
a=(4+0, @=C7424+0C 3 aiao=l, 


so a is a root of f(z) = 2? + z— 1. Doing something similar for 6 gives 
the polynomial g(z) = 24 + 2z3 + 4z2 + 3z +1; consider g modulo 2 to 
show that it is irreducible. 


If f(z) = Qn2n + Gn_12""- 1 +++++a1z + a0, then 
n n—-1 n 1 
agz” +a,z + +--+ an-12Z +n = 2 f(=). 


(a) Use De Moivre’s Theorem to find a formula for sinné, then divide 
by cos” @ and take 0 = 1/n to show that tan(a/n) is a root of 


(b) If n = p is prime, then the binomial coefficient (") is a multiple of 
p for 7 = 1,2,...,p—1. Hence J 


jinrr-( 2 jorena()ee() 


is irreducible by Eisenstein’s criterion. 


(c) If n is composite and has an odd prime factor p, then tan(7/p) is a 
root of fn(z), so fp(z) is a factor of f,(z), so f(z) is not irreducible. 
The polynomial fio(z) has degree 8 and it has f5(z) as a factor; the 
quotient 524 — 10z? + 1 is the minimal polynomial of tan(7/10). It can 
be proved irreducible by using exercise 3.4. 
3.6


3.7 


3.8 


3.9 


3.10 


Let S = {d € Z| dais an algebraic integer }. If dj,dz are in S, then 
d, — dz is in S; it follows that S is the set of all multiples of some integer 
d, and this d is the denominator of a. 


The cos 36 formula shows that a is a root of f(z) = 2023 — 15z — 4. To 
show efficiently that f(z) is irreducible, use an idea from an example on 
page 4. Find the conjugates of a@ as in exercise 3.2; alternatively, divide 
f(z) by z — a and solve a quadratic. By problem 3.6, dena is a factor 
of 20; confirm that da is an algebraic integer for d = 10 and not for any 
smaller factor. 


1/n 


Write a = a!/" and c= a!/" — b'/": show that the polynomials 


f(z)=z"-a and g(z)=(z-c)"—b 


have rational coefficients, that a is a root of each, and that they have 
no other common roots. Deduce that a is rational. 


If 1, 1/2, V4 are linearly dependent, then ‘2 is a root of a rational 
quadratic. 


(a) Suppose that a and a+r are conjugates; let f(z) be their minimal 
polynomial. Since a is a root of f(z+ 7) we have f(z) = f(z+7r), and 
comparing coefficients gives r = 0. 


Suppose that ( is a repeated root of f(z). Then 6 is a root of the 
derivative f’(z), which is a non-zero rational polynomial; so the minimal 
polynomial g(z) of 8 has degree less than that of f(z) and is a factor of 
f(z). Since f(z) is irreducible, this is impossible. 


Suppose that 
f(2) = g(2)W(2) = (9m2™ +++ g0)(Pim2™ +++ ho) « 


By following the proof of Eisenstein’s Lemma we have (without loss of 
generality) that go,91,---;9n—2 are all multiples of p. But the leading 
coefficient of g(z) is £1, so g(z) has degree n — 1 and h(z) has degree 1. 
Thus f(z) has a rational root. But this is not so. 


a) We have z4 +1 = (22 +. az+1)(z2 —az+1) modulo p. 
b) In this case z4 + 1 = (z? + az — 1)(z? — az —1) modulo p... 


( 
( 
(c) ...and here 24 + 1 = (z* + a)(z? — a) modulo p. 
( 
( 


d) Try the first composite number you think of! 
e) Consider f(z + 1). 


If f(z) = r(z)s(z), then for each k we have r(a,)s(ax) = —1 and there- 
fore r(ax) = —s(ax); hence r(z) + s(z) is a non-zero polynomial with 
n distinct roots, which must have degree at least n; so the assumed 
factorisation of f is trivial. 


3.16 
3.17 


3.18 


3.19 


3.20 


3.21 


3.22 


3.23 
Consider the polynomial $(z - \alpha)(z - \beta) = z^2 - (\alpha + \beta)z + \alpha\beta$.


If a is not approximable to order s, then for every c, say for simplicity 
c = 1, the inequalities have only finitely many solutions. Now consider 
c’, the minimum over all these solutions of g*|a — p/q|. 


If 
o<|a-2) << 
aq} @ 
then 
a+) < [ral |a—2) < [2a) +c 
qd q 
and so 5 
2 P| 2 alte) 
e|—  (q2)s/2 
For any m, we have 
m 1 Dp 
a= 5 —+R=-=+R 


with q=a°” and0<R< 2/q°. 


Use a “two-dimensional” version of the ideas in the proof of Lemma 3.20. 
Comment. By actually going through the calculations, s = 41 is the 
smallest multiplier that will work. 


For any real s > 0 there exist infinitely many m such that by,41/bm > §, 
and for any such m we can show 


2 
0< at < a with q=a. 
qd q 


Without loss of generality we consider a real number between 0 and
1, say $x = 0.ddd\cdots$ (the $d$s need not be all the same). This can be
expressed as a sum $x = \alpha + \beta$ with

$$\alpha = 0.d\cdots d0\cdots 0d\cdots d0\cdots \quad\text{and}\quad \beta = 0.0\cdots 0d\cdots d0\cdots 0d\cdots ,$$

the runs of zeros having lengths approximately 1!, 2!, 3!, 4! and so on.
Then, essentially, Liouville's basic proof applies to both $\alpha$ and $\beta$.


(a) To construct a cube twice the volume of a given cube is equivalent to
constructing a line segment $\alpha = \sqrt[3]{2}$ times the length of a given segment.
This is impossible since the degree of $\sqrt[3]{2}$ is not a power of 2.


(b) Given two lines separated by angle $\theta$, dropping a perpendicular from
one to the other, which can be done by ruler and compasses, creates two
line segments with lengths in the ratio $\cos\theta$. When $\cos\theta$ is rational,
$f(z)$ is a rational polynomial having $\alpha = \cos\tfrac13\theta$ as a root; and if $f(z)$ is
irreducible, then $\alpha$ has degree 3. To find a non-trisectable angle $\theta$ using
what we have done in this question, we need $\cos\theta$ to be rational: if we
try to keep things simple by choosing $\theta = r\pi$ with $r$ rational, there are
not many options!


3.24


3.25


3.26


For the third celebrated unsolved construction problem of antiquity, see 
Corollary 5.8. 


Two points $\{P_1, P_2\}$ will not suffice (consider points on the perpendicular
bisector of $P_1P_2$). Choose $P_1 = (0,0)$, $P_2 = (a,0)$ and $P_3 = (0,a)$,
where $a$ is to be chosen later. If there exists $(x,y)$ which is not an irrational
distance from any of these points, then

$$x^2 + y^2 = r^2, \qquad (x-a)^2 + y^2 = s^2, \qquad x^2 + (y-a)^2 = t^2$$

for some rational numbers $r, s, t$; this is true only if
$$4r^2a^2 = (a^2 + r^2 - s^2)^2 + (a^2 + r^2 - t^2)^2 .$$

But we can choose $a$ in such a way that this cannot be true for any
rationals $r, s, t$, and this will ensure that a “bad” point $(x,y)$ cannot
exist.


Assuming that x is not zero, we have 


a) 


Be 
~ | bx 


the inequality holding since c is not zero, and 


2 2 
2 Ue 3a 
+ai+—=>—. 
o “3 w2—- 4 
Therefore ; 
0< lo - 4 & a ; 
zl ~ [zi 


but $\alpha$ is an algebraic number of degree at most 3, and so in view of
Roth's Theorem (or Thue's) these inequalities cannot hold for infinitely
many $x$.


Work in the complex plane. Show that if the sides of the $p$-gon, in
anticlockwise order, are $a_0, a_1, \ldots, a_{p-1}$, then

$$a_0 + a_1\zeta + a_2\zeta^2 + \cdots + a_{p-1}\zeta^{p-1} = 0 ,$$

where $\zeta = e^{2\pi i/p}$. But $\zeta$ is an algebraic number with minimal polynomial
$1 + z + z^2 + \cdots + z^{p-1}$, and hence $a_0 = a_1 = a_2 = \cdots = a_{p-1}$.


3.27 


3.28 
For any integer $k > 1$, let $n_k$ be the total number of digits in the integers
$1, 2, 3, \ldots, 10^k - 1$. By following the procedure in the example, show that

$$|q\xi - p| < \frac{c}{q^t}$$

with $q = (10^k - 1)^2\, 10^{n_{k-1}}$ and

$$t = \frac{9k\,10^{k-1} - 2k}{(k-1)10^{k-1} + 2k} ,$$

or something similar, depending on the exact details of your estimates.
Then show that $t > 9$ for all sufficiently large $k$ and $t \to 9$ as $k \to \infty$,
so that $\xi$ is approximable to order 10 (but not to any higher order: at
least, not by using this method).


Liouville's Theorem shows that $\xi$ is not algebraic of degree less than 10,
Siegel shows $\xi$ is not algebraic of degree less than 25, and Roth shows
that $\xi$ is transcendental.
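The behaviour of the exponent $t$ in the hint above can be explored numerically. The sketch below assumes the closed form for $t$ as read from the hint and counts digits with the formula $n_k = \sum_{j=1}^{k} 9j\,10^{j-1}$ (an elementary count, not quoted from the text); the function names are illustrative.

```python
from fractions import Fraction


def digit_count(k):
    """n_k: total number of digits in the integers 1, 2, ..., 10^k - 1."""
    return sum(9 * j * 10 ** (j - 1) for j in range(1, k + 1))


def exponent_t(k):
    """The exponent t from the hint, kept exact as a fraction."""
    return Fraction(9 * k * 10 ** (k - 1) - 2 * k,
                    (k - 1) * 10 ** (k - 1) + 2 * k)


for k in (2, 3, 5, 10, 20, 100):
    print(k, digit_count(k) if k <= 5 else "...", float(exponent_t(k)))
```

The printed values stay above 9 and approach it only slowly, roughly like $9 + 9/(k-1)$, which is why the method gives order 10 and nothing better.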


Let s > 0; for any m > 2s, define 


Dm = (2° at 1) ao as 1) .. (2 a 1) 
dm = glit2l+--+-m! 
Pm _ 74 1 
mat fis d) 
k=1 


and then use the suggested inequality to show that 


2a 2a 


Pm 
a(mtiy! < , 


0<|a— 24) < 
dm 


To prove the inequality, we can use a method suggested by Lovro Soldo:
show by induction that all $2^t$ terms obtained by expanding

$$\prod_{k=m+1}^{m+t} \left(1 + \frac{1}{a^{k!}}\right)$$

are different, so that
m+t 
1 1 
II (1 + =) =1+ a(m+1)! ee 
k=m-+1 
1 1 1 
CHAPTER 4
4.1 (a) $[3,1,4,1,5,9]$; (b) $[a,1,1,2a,1,1,2a,\ldots]$; (c) $\infty$.
4.2 (a) Use the Euclidean algorithm as usual, noting that the recurrence for
$p_k$ gives the quotients and remainders, $p_k = a_k p_{k-1} + p_{k-2}$ and so on:
you should get

$$\frac{p_n}{p_{n-1}} = a_n + \frac{1}{a_{n-1}+}\,\cdots\,\frac{1}{a_1+}\,\frac{1}{a_0} .$$


4.3


4.4


4.5


4.6


4.7


4.8

(b) For the “if” proof, use the uniqueness result, Lemma 4.3. 
Prove by induction that $p_{k+m}q_k - p_k q_{k+m} = (-1)^k Q_m$, where

$$\frac{P_m}{Q_m} = a_{k+1} + \frac{1}{a_{k+2}+}\,\cdots\,\frac{1}{a_{k+m}} .$$
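Before attempting the induction it can help to test the identity on random data. This sketch assumes the usual indexing $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = a_0$, $q_0 = 1$ and takes $P_m/Q_m$ to be the last convergent of $[a_{k+1}; a_{k+2}, \ldots, a_{k+m}]$; both the indexing and the function names are assumptions made for illustration.

```python
import random


def convergents(a):
    """Return lists p, q with p[k]/q[k] = [a[0]; a[1], ..., a[k]]."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p_cur, q_cur = a[0], 1         # p_0, q_0
    ps, qs = [p_cur], [q_cur]
    for ak in a[1:]:
        p_prev, p_cur = p_cur, ak * p_cur + p_prev
        q_prev, q_cur = q_cur, ak * q_cur + q_prev
        ps.append(p_cur)
        qs.append(q_cur)
    return ps, qs


def identity_holds(a, k, m):
    """Check p_{k+m} q_k - p_k q_{k+m} == (-1)^k Q_m for the quotients a."""
    p, q = convergents(a)
    _, big_q = convergents(a[k + 1:k + m + 1])
    return p[k + m] * q[k] - p[k] * q[k + m] == (-1) ** k * big_q[-1]


random.seed(0)
quotients = [random.randint(1, 9) for _ in range(12)]
print(all(identity_holds(quotients, k, m)
          for k in range(6) for m in range(1, 6)))
```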


Show that $p_n$ and $q_n$ defined in this way satisfy the same recurrences
and initial conditions as $p_n$ and $q_n$ defined in the usual way.


By induction, the product is $\begin{pmatrix} p_k & p_{k-1} \\ q_k & q_{k-1} \end{pmatrix}$; now take determinants.


Use exercise 4.2 or exercise 4.5 to evaluate the continued fraction as
$$\frac{p_k^2 + p_{k-1}^2}{p_k q_k + p_{k-1} q_{k-1}} .$$

This fraction is in lowest terms since any prime common factor of the
numerator and denominator is also a factor of

$$(p_k^2 + p_{k-1}^2)\,q_k - p_k\,(p_k q_k + p_{k-1} q_{k-1}) = \pm p_{k-1}$$
and hence also of $p_k$.


Evaluate a from the relation 


1 1 1 
@it+ +++ Gnit ata’ 


Qa=ag+ 
and use problem 4.2(b). 
(a) If $\alpha$ has convergents $p_k/q_k$, the minimal polynomial is

$$f(z) = q_{n-1}z^2 + (q_{n-2} - p_{n-1})z - p_{n-2} ;$$

evaluate $f(-1)$ and $f(0)$.


(b) Induction, using $\alpha'_{k+1} = 1/(\alpha'_k - a_k)$.


4.9 
4.10 


4.11 
(d) Note that

$$\alpha_{m_0} = a_{m_0} + \frac{1}{a_{m_0+1}+}\,\cdots\,\frac{1}{a_{m_0+n-1}+}\,\frac{1}{a_{m_0}+}\,\cdots = \alpha_{m_0+n}$$

and use (c). Starting with the fact that $a_{m+n} = a_m$ is true for $m \ge m_0$,
we have shown that it is true for $m \ge m_0 - 1$; repeating the argument
$m_0$ times shows that it is true for $m \ge 0$, and this is what we wanted.


Use Theorem 4.13. 
(a) Find a formula for $q_n/q_{n-1}$ as in exercise 4.2; hence show that


1 l f.3 : In = In-1 
——— — with « = ———.. 
ay + +++ Gn-1+ GQn+ & dn 


m 
(b) If By = x, —, is the mth partial sum of the series, we can write 

k=0 

Pm qm 
m=—, Gm=10° , gcd(pm,dm)=1. 
m 

We can then use (a) to find the continued fraction of $\beta_{m+1}$ from that of
$\beta_m$. Repeating the procedure indefinitely gives the partial quotients of
$\beta$, though we must note that the last partial quotient at each step (and
only the last) will change at the succeeding step. This yields


a =e: g+2+ g+ g+ g-2+ g+ gt+2+-:: 


1 1 
bp + bot ::: 


? 


where b} = g—1, bp = g + 2, b3 = g; and box = g, box4, = g — 2 for 
k > 2; and 


bstlepen Hb for k> Dand 133,22" 1, 


Applying this recursion gives b2345 = bi752 =--: = bop = g — 2. 

(c) We know that $\beta$ is irrational and hence is approximable to order 2 or
more. Now look carefully at how we proved Theorem 4.16. Comment.
We shall prove in Chapter 6 that $\beta$ is transcendental.


(d) We can do something similar for $g = 2$, but we need a version of (a)
which applies when $a_n = 1$. Specifically, if $a_n = 1$, then
Dn (—1)” 
in @ 


= [0,@1,---;@n—2,;An—1,An—1 +2,dn—2,..-,@1| . 


Prove that the partial denominators of a satisfy q, < 107". 
4.12


4.13 
4.14 


(a) First, write

$$|q\alpha - p - \beta| \le \left|q\left(\alpha - \frac{m}{n}\right)\right| + \frac{|mq - np - n\beta|}{n} .$$

Choose $m/n$ to be an appropriate convergent to $\alpha$. Then use the Bézout
property, Lemma 1.11, to show that there exist integers $p, q$ such that
$n \le q < 2n$ and $|mq - np - n\beta| < 1$. This will give

$$|q\alpha - p - \beta| < \frac{3}{n} ;$$

now go back to the choice of $m/n$ and work out what “appropriate”
should mean. Explain carefully why we shall obtain infinitely many
possibilities for $(p,q)$.


(b) Given 6, 2, use Kronecker’s Theorem with suitably chosen values 
of @ and ¢ to prove the stated result. Do the converse too. 


(c) Rephrasing the question, we wish to show that for any positive integer
$n$ there exist positive integers $p, q$ such that

$$n\,10^p \le 2^q < (n+1)\,10^p ,$$

that is,
$$\log n \le q\log 2 - p < \log(n+1) ,$$

where log denotes the logarithm to base 10. This follows quite easily
from (b), though a little care is needed to ensure that $p$ is positive.
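The statement in (c) says that every finite digit string occurs as the leading digits of some power of 2. A direct search illustrates this for small cases; it is only a sketch, and the search bound `max_q` is an arbitrary choice made here (kept below 14000 so that the decimal strings stay under Python's default integer-to-string length limit).

```python
def power_of_two_starting_with(n, max_q=14_000):
    """Find the least q (and the matching p >= 1) with n*10^p <= 2^q < (n+1)*10^p."""
    prefix = str(n)
    value = 1
    for q in range(1, max_q + 1):
        value *= 2
        digits = str(value)
        # Require at least one digit after the prefix, so that p is positive.
        if len(digits) > len(prefix) and digits.startswith(prefix):
            return q, len(digits) - len(prefix)
    return None


for n in (1, 7, 9, 31, 123):
    print(n, power_of_two_starting_with(n))
```

For example, the search for $n = 7$ stops at $2^{46} = 70368744177664$.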


Calculate $\pi^4 = [97, 2, 2, 3, 1, 16539, \ldots]$.
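The partial quotients quoted above can be reproduced with exact rational arithmetic. The value being expanded is taken here to be $\pi^4$, which matches the leading term 97. This is a sketch assuming a 50-decimal value of $\pi$ typed in as a constant (ample precision for six partial quotients); the function name is illustrative.

```python
from fractions import Fraction

# pi to 50 decimal places, held exactly as a rational number.
PI = Fraction("3.14159265358979323846264338327950288419716939937510")


def continued_fraction(x, terms):
    """First `terms` partial quotients of the positive rational x."""
    out = []
    for _ in range(terms):
        a = x.numerator // x.denominator
        out.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return out


print(continued_fraction(PI ** 4, 6))
```

The large partial quotient 16539 signals that the preceding convergent, $2143/22$, is an unusually good approximation to $\pi^4$.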


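Suppose that the inequality fails for both $p_{k-1}/q_{k-1}$ and $p_k/q_k$, and note

$$\left|\left(\frac{p_k}{q_k} - \alpha\right) + \left(\alpha - \frac{p_{k-1}}{q_{k-1}}\right)\right| = \left|\alpha - \frac{p_k}{q_k}\right| + \left|\alpha - \frac{p_{k-1}}{q_{k-1}}\right| ,$$

equality holding since the two bracketed expressions have the same sign.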


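We may calculate

$$\frac{\pi^{12}}{\zeta(12)} = 924041 + \frac{1}{1+}\,\frac{1}{3+}\,\frac{1}{1+}\,\frac{1}{2+}\,\frac{1}{2+}\,\frac{1}{1+}\,\frac{1}{13+}\,\cdots$$
...and if you are not tempted by the 13, the next partial quotient but
one is 82225, giving the conjecture

$$\frac{\pi^{12}}{\zeta(12)} = \frac{638512875}{691} .$$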

The best possible $A_n$ is $\sqrt{n^2 + 4}$, the proof being essentially the same
as for Hurwitz' Theorem.


4.17 


4.18 


4.19 


4.20 
The approximate location of the roots should be obvious from the first
formula for f(z). Applying the procedure of section 4.7 (with computer 
assistance if desired) gives the middle root as 


1 i 1 1 1 1 


1 ae ae . 
"pa ee Fe Se Wee 1 vc 


From problem 3.5, the minimal polynomial is $f(z) = z^4 - 10z^2 + 5$, and
we obtain the continued fraction 


1 1 1 1 1 L 1 1 il 1 1 1 


for $\alpha$. Hence the required inequality is satisfied by

$$\frac{p}{q} = \frac{p_{11}}{q_{11}} = \frac{46041}{63370} .$$


1 1 
Very similar to Theorem 4.22. Begin with coth as instead of coth 5 
P 


Any $p/q$ is either a convergent to $\alpha$ or not: consider the two cases
separately. If $p/q$ is not a convergent, use Theorem 4.13 to find a constant
$c_1$ such that
$$\left|\alpha - \frac{p}{q}\right| > \frac{c_1}{q^2 \log q} ,$$
provided that $q > 1$.


If $p/q$ is a convergent $p_k/q_k$ to $\alpha$, first prove that $q_k > 2^{k/2}$. Dispose
of the one possible exception to this inequality by recalling that $q \ne 1$.
Then take the equality in equation (4.5); estimate $a_{k+1}$ in terms of $k$,
and hence in terms of $\log q_k$, to show that

$$\left|\alpha - \frac{p_k}{q_k}\right| > \frac{c_2}{q_k^2 \log q_k}$$

for some $c_2$. Then the required result holds with $c = \min(c_1, c_2)$. If $\alpha$
is approximable to order $s > 2$, then $q^{s-2}$ is less than a constant times
$\log q$ for arbitrarily large $q$, which is impossible. All of this applies to $e$
since the partial quotients of $e$ satisfy $a_k \le k$ whenever $k \ge 1$.
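The closing claim about the partial quotients of $e$ is easy to inspect numerically. The sketch below approximates $e$ by the rational partial sum $\sum_{n<60} 1/n!$ (an assumption about how much precision is enough for the first 30 partial quotients; the truncation error is far smaller than needed) and expands it as a continued fraction.

```python
from fractions import Fraction
from math import factorial


def continued_fraction(x, terms):
    """First `terms` partial quotients of the positive rational x."""
    out = []
    for _ in range(terms):
        a = x.numerator // x.denominator
        out.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return out


# Rational approximation to e; the tail beyond 1/59! is below 1e-80.
E_APPROX = sum(Fraction(1, factorial(n)) for n in range(60))

quotients = continued_fraction(E_APPROX, 30)
print(quotients)
print(all(quotients[k] <= k for k in range(1, 30)))
```

The output shows the familiar pattern $2, 1, 2, 1, 1, 4, 1, 1, 6, \ldots$, in which the large quotients grow only linearly with their index.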


By manipulating series we can write the modified Bessel function in 
terms of the function f(c;z) from section 4.8, 
2 


Loh = ey) He) 


If we now take appropriate values for v and x, then Theorem 4.19 yields 
1 1 1 1 — Io(2) 


i = 
24 34 44 54 AQ) 
4.22


4.23 


To evaluate $I_k$, write one of the factors $x$ in the integrand as $(x - 1) + 1$
and split into two integrals. For $K_k$, integrate by parts; for $J_k$, use
both these techniques. With careful work, this will give the recurrence
relations


ip=JeP Kpaiy dy = Qh + Api + dpe, Ke === des 
If $k \ge 0$ we have $a_{3k+2} = 2k + 2$ and $a_{3k+3} = a_{3k+4} = 1$; this leads to

$$r_{3k+2} = (2k+2)\,r_{3k+1} + r_{3k}$$

$$r_{3k+3} = r_{3k+2} + r_{3k+1}$$

$$r_{3k+4} = r_{3k+3} + r_{3k+2} .$$


Use all this information to prove the required formulae simultaneously 
by induction. A straightforward estimate shows that the integrals all 
tend to zero, and it follows that 


$$\alpha = \lim_{m \to \infty} \frac{p_m}{q_m} = e .$$


A comprehensive exposition of the origin of these integral formulae in 
the work of Hermite is given by Cohn [20]. 


(a,b) We have $de = 1 + k(p-1)(q-1)$ for some integer $k \ge 1$; show that


@: ok k p+q 1 

ae ae 

and use Theorem 4.13. This means that we are seeking a convergent
with $q_k < \tfrac13 n^{1/4}$. Since the partial denominators of a continued fraction
increase more or less exponentially, $q_k \approx b^k$, the maximum number of
attempts needed will be something like

$$k \approx \frac14\,\frac{\log n}{\log b} .$$

Even if $n$ is about $10^{200}$, we have only 50 or so possibilities to check,
which is not much work with suitable software. (Of course the exact details
depend on the value of $b$; but we only intend to give a rough argument.)


(c) We calculate the continued fraction

$$\frac{e}{n} = \frac{1}{49+}\,\frac{1}{7+}\,\frac{1}{4+}\,\frac{1}{234+}\,\cdots ,$$

noting that the partial quotient of 234 means that the $q_k$ are increasing
very rapidly and we should have a short search. The corresponding
convergents are

$$\frac{p}{q} = \frac{1}{49},\ \frac{7}{344},\ \frac{29}{1425},\ \frac{6793}{333794} ;$$
we can stop here as the fourth denominator is already too large. Now
we should have $c^{de} \equiv c \pmod n$ for every $c$; taking $c = 2$ for simplicity,
we calculate

$$2^{49e} \equiv 961918896598014, \qquad 2^{344e} \equiv 194153471695549, \qquad 2^{1425e} \equiv 2 .$$

Thus only $d = 1425$ remains as a possibility; we can now find $p$ and $q$
to confirm that this possibility is in fact correct.


This method of breaking a carelessly set up RSA code is known as 
Wiener’s attack; more information can be found in [69]. 
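The search described in this hint can be sketched in a few lines. The example below does not use the exercise's own numbers (the modulus $n$ and exponent $e$ are not reproduced in the hint); instead it builds a hypothetical toy key from the primes 10007 and 10009 with the deliberately small private exponent $d = 29$. Rather than testing $2^{de} \equiv 2$, this sketch confirms each candidate by recovering $\varphi = (de-1)/k$ and solving the quadratic for $p$ and $q$, a common variant of the same idea.

```python
from math import isqrt


def convergents_of(num, den):
    """Yield the convergents (k, d) of the rational num/den, in order."""
    p_prev, q_prev, p_cur, q_cur = 0, 1, 1, 0
    while den:
        a = num // den
        num, den = den, num - a * den
        p_prev, p_cur = p_cur, a * p_cur + p_prev
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        yield p_cur, q_cur


def wiener_attack(e, n):
    """Try each convergent k/d of e/n; return (d, p, q) if one factors n."""
    for k, d in convergents_of(e, n):
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k
        s = n - phi + 1                 # equals p + q when phi is right
        disc = s * s - 4 * n            # equals (p - q)^2 when phi is right
        if disc >= 0:
            r = isqrt(disc)
            if r * r == disc and (s + r) % 2 == 0:
                return d, (s + r) // 2, (s - r) // 2
    return None


# Hypothetical toy key, chosen so that d is below n^(1/4) / 3.
P, Q, D = 10007, 10009, 29
N = P * Q
E = pow(D, -1, (P - 1) * (Q - 1))       # public exponent (Python 3.8+)

print(wiener_attack(E, N))
```

Only a handful of convergents are examined before the private exponent appears as a denominator, in line with the rough count given in parts (a,b).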


4.24 Let $\alpha$ have partial quotients $a_k$ and complete quotients $\alpha_k$. At the $k$th
step we remove $a_k$ squares, leaving a rectangle of size
$$\frac{1}{\alpha_1\alpha_2\cdots\alpha_{k+1}} \times \frac{1}{\alpha_1\alpha_2\cdots\alpha_k}$$

(which is similar to a rectangle of size $1 \times \alpha_{k+1}$). Prove this by induction.


4.25 Since we are told that Holmes “glanc[ed] at his watch”, it appears
reasonable to assume that he observed the passage of (say) $p$ telegraph
posts in (say) $q$ seconds. It seems improbable that he could achieve any
greater accuracy than a single post, or a single second, so we assume
that $p$ and $q$ are both integers. Converting a speed of $p$ posts in $q$
seconds to miles per hour and noting that Holmes gives the speed to the
nearest half mile per hour (or better), we need

os 535] ee. Waits. 2 = ma ges 
1lq 2 4 q 2700 360 
so we look at the convergents of $1177/2700$. The first sufficiently
accurate convergent is $p/q = 7/16$, the next is $p/q = 17/39$. The most
plausible conclusion would seem to be that Holmes observed 7 telegraph
posts in 16 seconds, and then calculated $(7/16) \times (1350/11)$ in his head.
Elementary!
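The convergents mentioned here can be listed directly. In this sketch the conversion factor $1350/11$ is taken from the hint; the target speed $53\tfrac12$ mph is inferred from $\tfrac{1177}{2700} \times \tfrac{1350}{11} = 53.5$; and the tolerance of a quarter of a mile per hour is an assumption standing in for "to the nearest half mile per hour".

```python
from fractions import Fraction

MPH_PER_POST_PER_SECOND = Fraction(1350, 11)   # conversion factor from the hint
TARGET_RATE = Fraction(1177, 2700)             # posts per second
TARGET_SPEED = TARGET_RATE * MPH_PER_POST_PER_SECOND
TOLERANCE = Fraction(1, 4)                     # assumed rounding tolerance, in mph


def convergents(x):
    """List the convergents of the non-negative rational x as Fractions."""
    num, den = x.numerator, x.denominator
    p_prev, q_prev, p_cur, q_cur = 0, 1, 1, 0
    out = []
    while den:
        a = num // den
        num, den = den, num - a * den
        p_prev, p_cur = p_cur, a * p_cur + p_prev
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        out.append(Fraction(p_cur, q_cur))
    return out


for c in convergents(TARGET_RATE):
    speed = c * MPH_PER_POST_PER_SECOND
    ok = abs(speed - TARGET_SPEED) <= TOLERANCE
    print(c, float(speed), "accurate enough" if ok else "")
```

Under these assumptions the first convergent that lands within tolerance is $7/16$, followed by $17/39$.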


4.26 The figure of 29.41% suggests an extensive study involving maybe 10000 
simulations. An investigation employing continued fractions might lead 
to the view that the actual number of tests performed was consider- 
ably smaller than this, and that the results of the simulation cannot, 
therefore, be considered very reliable. 


CHAPTER 5 
5.1 If the sides of the rectangle are $2x$ and $2y$ we get
4 2 
es : a where pes ee 
T 1? t+2 y «x 


and this gives a quadratic for $\pi$.
5.2


5.3 


5.4 


5.6 


5.7 
5.8 


5.9 


Answer: $e_1^4 - 4e_1^2e_2 + 8e_1e_3 - 16e_4$.


Write $f(z) = \prod_{k=1}^{n} (z + a_k)$; then each side of the equation is $f'(1)/f(1)$.
Let $\alpha_1, \ldots, \alpha_n$ be complex numbers and write $\beta_k = e_k(\alpha_1, \ldots, \alpha_n)$,
where $e_k$ is the $k$th elementary symmetric polynomial in $n$ variables. If at
least one of $\alpha_1, \ldots, \alpha_n$ is transcendental, then at least one of $\beta_1, \ldots, \beta_n$
is transcendental.
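(a) Writing $\zeta = e^{2\pi i/3}$, the conjugates are

$$\alpha_1 = \sqrt2 + \sqrt[3]{3}, \qquad \alpha_2 = -\sqrt2 + \sqrt[3]{3},$$
$$\alpha_3 = \sqrt2 + \sqrt[3]{3}\,\zeta, \qquad \alpha_4 = -\sqrt2 + \sqrt[3]{3}\,\zeta,$$
$$\alpha_5 = \sqrt2 + \sqrt[3]{3}\,\zeta^2, \qquad \alpha_6 = -\sqrt2 + \sqrt[3]{3}\,\zeta^2 .$$


(b) The values $\alpha_1 + \alpha_2$ and so on consist of a complete set of conjugates

$$\{\, \pm 2\sqrt2 - \sqrt[3]{3}\,\zeta^k \mid k = 0, 1, 2 \,\}$$

with 6 elements; a complete set of conjugates $\{\, 2\sqrt[3]{3},\ 2\sqrt[3]{3}\,\zeta,\ 2\sqrt[3]{3}\,\zeta^2 \,\}$;
and the set $\{\, -\sqrt[3]{3},\ -\sqrt[3]{3}\,\zeta,\ -\sqrt[3]{3}\,\zeta^2 \,\}$ which occurs twice.


(c) Here there is a complete set of conjugates $\{\, \sqrt2 + \sqrt[3]{3} + 2\sqrt[3]{3}\,\zeta, \ldots \,\}$
with 12 elements; and $\{\, 3\sqrt2, -3\sqrt2 \,\}$; and $\{\, \sqrt2, -\sqrt2 \,\}$ three times.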
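The counts in (b) and (c) can be confirmed in floating point by forming all sums of two and of three distinct conjugates and tallying how often each value occurs. This is only a numerical sanity check, assuming the six conjugates are $\pm\sqrt2 + \sqrt[3]{3}\,\zeta^k$; values are rounded to nine decimals before comparison.

```python
import cmath
from collections import Counter
from itertools import combinations

zeta = cmath.exp(2j * cmath.pi / 3)
conjugates = [sign * 2 ** 0.5 + 3 ** (1 / 3) * zeta ** k
              for k in range(3) for sign in (1, -1)]


def multiplicities(r):
    """Map 'number of times a sum occurs' -> 'how many distinct sums occur that often'."""
    sums = Counter()
    for combo in combinations(conjugates, r):
        total = sum(combo)
        sums[(round(total.real, 9), round(total.imag, 9))] += 1
    return Counter(sums.values())


print(multiplicities(2))   # sums of two conjugates
print(multiplicities(3))   # sums of three conjugates
```

For pairs this gives nine values occurring once and three occurring twice (15 sums in all); for triples, fourteen values occurring once and two occurring three times (20 sums in all), in agreement with the sets listed above.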


If $\alpha$ is algebraic with denominator $d$ and $e^{\alpha}$ is algebraic, consider $e^{d\alpha}$.
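Using the fact that $\alpha_1 + \alpha_2 + \alpha_3 = 0$, we have

$$p(e^{\alpha_1})\,p(e^{\alpha_2})\,p(e^{\alpha_3}) = -(e^{\alpha_1} + e^{\alpha_2} + e^{\alpha_3}) + (e^{-\alpha_1} + e^{-\alpha_2} + e^{-\alpha_3}) ,$$
and none of these terms is an integer.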
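To show that (1) implies (2) rewrite the equation $\beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} = 0$ as

$$e^{\alpha_1 - \alpha_2} = -\frac{\beta_2}{\beta_1} ,$$

assuming that $\beta_1 \ne 0$. For the converse, consider $1(e^{\alpha}) + (-e^{\alpha})e^{0} = 0$.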


If $\alpha$ is algebraic and $\tan\alpha = \beta$, then $i\alpha$ is algebraic and
$$e^{2i\alpha} = \frac{1 + i\beta}{1 - i\beta} .$$

Comment. There is no requirement in this question that $\alpha$ be real.


5.11 


5.12 
5.13 


5.14 
Evaluating the integral and the sum leads to $c^2 + (\log c - 2)c + 1 = 0$,
and upon substituting $c = e^{-\lambda}$ this becomes $e^{2\lambda} - (\lambda + 2)e^{\lambda} + 1 = 0$.
Writing the equation as $\lambda = -2 + 2\cosh\lambda$ makes it easy to see that
there is a unique positive solution for $\lambda$. If $\lambda$ is algebraic, then $e^{\lambda}$ is a
root of a quadratic equation with algebraic coefficients and is therefore
also algebraic: this is impossible.
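The positive solution of the equation above is easy to locate numerically. The sketch uses plain bisection on $g(\lambda) = 2\cosh\lambda - 2 - \lambda$; the bracketing interval $[0.5, 1]$ is a choice made here (note that $g(0.5) < 0 < g(1)$), and the result is substituted back into the equation for $c$ as a check.

```python
from math import cosh, exp, log


def positive_root(lo=0.5, hi=1.0, tol=1e-12):
    """Bisection for the positive root of 2*cosh(x) - 2 - x = 0 on [lo, hi]."""
    def g(x):
        return 2 * cosh(x) - 2 - x

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


lam = positive_root()
c = exp(-lam)
residual = c * c + (log(c) - 2) * c + 1   # should vanish up to rounding

print(lam, c, residual)
```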


First prove a irrational, then use the Gelfond—Schneider Theorem. 


Apply the Gelfond-Schneider Theorem to $\alpha^i$ and show that at least one
of $\cos(\log\alpha)$ and $\sin(\log\alpha)$ is transcendental. But an exceedingly well
known relation between these numbers guarantees that it is impossible
for only one of them to be transcendental. Comment. For readers who
are comfortable with complex logarithms, this result is easily extended
to any complex algebraic $\alpha$ except for 0 and 1.


Statement (1) is equivalent to the following: if0 < a < $ and a/(1—2a) 
is an algebraic irrational, then 2 cosa is transcendental. Prove that (2) 
implies this statement. 


Conversely, let 3 be an algebraic irrational and suppose without loss of 
generality that 0< 6B < 4; consider an isosceles triangle with base angle 
a = 7, and deduce (2) from the above restatement of (1). 


Statement (2) follows from the Gelfond—Schneider Theorem by taking a 
suitable value of a. 


CHAPTER 6 


6.1 


6.2 


If $d_k$ is the denominator of $\beta_k$, then $(d_1\beta_1)(d_2\beta_2)\cdots(d_m\beta_m)$ is an
algebraic integer, and this gives an estimate for $\operatorname{den}(\beta_1\beta_2\cdots\beta_m)$;
furthermore, every conjugate of $\beta_1\beta_2\cdots\beta_m$ is a product of conjugates
of $\beta_1, \beta_2, \ldots, \beta_m$. For an example where equality does not hold, take
$\beta_1 = \sqrt2$ and let $\beta_2$ be an algebraic number, not an algebraic integer,
which is closely related to it.


To show that $\|\beta_1 + \beta_2\| \le \|\beta_1\| + \|\beta_2\|$ does not always hold, we need an
example where a common denominator for $\beta_1, \beta_2$ is greater than their
individual denominators. Try thinking about some simple quadratic
irrationals.


The result is obvious when $|\alpha| \ge 1$. If $|\alpha| < 1$, note that

$$a_0 = -a_1\alpha - a_2\alpha^2 - \cdots - a_n\alpha^n , \qquad (8.1)$$

so $1 \le H(f)\,(|\alpha| + |\alpha|^2 + \cdots + |\alpha|^n)$, and it is easy to estimate the sum
in brackets. To solve the second part of the question, turn the above
inequality into an equality by choosing an $\alpha$ for which all the terms on
the right-hand side of (8.1) are real and of the same sign.
6.3


6.4 


6.5 


6.6 


(a) Consider the behaviour of $|p(z)/z^k|$ in the limit as $|z|$ tends to infinity.


(b) If $e^z$ is not transcendental, then there is an algebraic relation which
can be written

$$p_m(z)e^{mz} = p_{m-1}(z)e^{(m-1)z} + \cdots + p_0(z)$$

and which holds for all $z$. Obtain a contradiction by showing that when
$|z|$ is large, the left-hand side is greater than the right-hand side in
absolute value.


The function - 
ad 
k=0 
satisfies the functional equation $f(z) = (1 + z)f(z^{10})$, and $f(z)$ can
be proved transcendental using the lemmas on page 173; therefore the
conditions of Theorem 6.6 apply, and $\alpha$ is transcendental.


Here the relevant power series $f(z)$ has coefficients which satisfy

$$a_{rk} = a_{rk+1} = a_{rk+2} = \cdots = a_{rk+(r-1)} = a_k$$
for $k \ge 1$, and the functional equation is
$$f(z) = (1 + z + z^2 + \cdots + z^{r-1})\,f(z^r) + (z + z^2 + \cdots + z^{r-1}) .$$

The power series coefficients $a_k$ satisfy
$$a_1 = 1 \quad\text{and}\quad a_{2k} = a_{2k+1} = 2a_k \quad\text{for } k \ge 1 .$$

Standard series convergence tests show that $f$ is analytic for $|z| < 1$.
Derive the functional equation

$$f(z) = (2 + 2z)\,f(z^2) + z ,$$


and conclude that the conditions of Mahler’s transcendence theorem are 
satisfied. It remains to prove that f is a transcendental function. First, 
show that f is not a rational function (noting that the first lemma in 
appendix 2.4 is not applicable). Suppose that f(z) = p(z)/q(z), where 
p and q are polynomials of degrees m and n respectively. Write the 
functional equation in terms of p and q, and use it to show that m = n. 
By comparing coefficients of suitable powers of $z$, it follows that either
$p_m$ or $q_m$ is zero, which is not true. Thus $f$ is not a rational function,
and the second lemma in appendix 2.4 shows that $f$ is transcendental.


6.7 We have

$$(1 - z)f(z) = z(1 - z) + 2z^2(1 - z^2) + 3z^4(1 - z^4) + \cdots = z + z^2 + z^4 + z^8 + \cdots .$$


6.8 


6.9 
Show that the power series coefficients $a_k$ satisfy
$$a_{3k-1} = a_{3k} = a_{3k+1} = a_k$$
for $k > 2$, and hence obtain the functional equation


= 1l+z24+2? 


iP) pe oe ae 
z 


f(z) 
This does not satisfy the conditions of our “functional equations and 
transcendence” theorem; however, considering a closely related function 
g(z) will give a functional equation without that annoying z in the de- 
nominator. 


Clearly Theorem 6.6 does not apply in this case, but we can follow very
much the same method. Let $f(z)$ and $\zeta$ be as stated, and suppose that
$f(\zeta)$ and $f(\zeta^2)$ are both algebraic. By assumption there is a Taylor series


f(Z= Ss axz* for |z|<1; 
k=0 


substituting this into the functional equation and extracting the coeffi- 
cient of z™ on both sides gives 


Ama if4|m 
Gm + Q@m—1 + Q1@m—2 + G14m—3 + = . 
0 otherwise, 


and it follows by induction that all the coefficients a, are integers. By 
iterating the functional equation we obtain 


Fe") =O(2)F 7 ?)* 
for t > 0, where the ¢; are the Fibonacci numbers and 


t-1 


(2) =]][(i+2”)” 


j=l 


is a polynomial with integer coefficients. Show that $\Phi_t(z)$ has degree less
than a constant times $2^t$. Construct an auxiliary function $E(z)$ exactly
as we have done throughout Chapter 6, and estimate $|E(\zeta^{2^t})|$. Show
that $E(\zeta^{2^t})$ is an algebraic number whose degree is bounded independently
of $s$ and $t$. Considering $E(\zeta^{2^t})$ as a polynomial in three variables
evaluated at $z_1 = \zeta$, $z_2 = f(\zeta)$, $z_3 = f(\zeta^2)$, estimate its algebraic size
by means of Corollary 6.3. Finally, show how to choose $s$ and $t$ so that
the estimates we have made contradict the fundamental inequality for
algebraic numbers.
Figure 6.1 A deterministic finite automaton for exercise 6.10.


6.10 A suitable DFA is shown in figure 6.1. Using the method of section 6.5,
observe that the characteristic function of $S$ is $f(z) = f_3(z)$. If we also
write $g(z) = f_2(z)$, then we can show that


Zz 


f(z) = (L+2)f(2*) + 29(2") , g(2) = -2f(2*) — 29(2") + 


CHAPTER 7 


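7.1 It is convenient to also define $a_0 = 0$ and $b_0 = 1$; then we have

$$\begin{pmatrix} p_k & p_{k-1} \\ q_k & q_{k-1} \end{pmatrix} = \begin{pmatrix} a_0 & 1 \\ b_0 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ b_1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_k & 1 \\ b_k & 0 \end{pmatrix} .$$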
7.2 Induction. The smart way to prove the result for $q_{k+1}$ is to note that
$q_k + q_{k-1} = (k + 1)!$.


7.3 The $n$th convergent to the continued fraction satisfies

$$\frac{p_n}{q_n} = \sum_{k=1}^{n} \left( \frac{p_k}{q_k} - \frac{p_{k-1}}{q_{k-1}} \right) .$$

Now find a formula for $q_k$, use (7.5) and let $n \to \infty$.


7.4 Use the ideas of the previous problem. You should find that $q_k = k!$
for the first continued fraction, and $q_k = (2k - 1)(2k - 3)\cdots 1$ for the
second. From the continued fraction for $\tan^{-1} x$ we obtain

4 1 9 2 


0 ll —— 


5 
1+ 2+ 2+ 24.---0 


7.5 (a) Use the fact that $q_k > q_{k-1}$ to show that $q_k > 2q_{k-1}$, then use this
to show that $q_k > (2 + b_k)\,q_{k-1}$. This gives rather more than was asked
for.


(b) It is not hard to show that the inequality is true for k = 2, so look 
for an example with k = 3. 
7.6 Let $r^2 = s/t$. Informally, we find the continued fraction

$$r\tan r = \frac{s}{t-}\,\frac{st}{3t-}\,\frac{st}{5t-}\,\frac{st}{7t-}\,\cdots .$$

So, following the procedure of section 7.2.1, define suitable $a_k$ and $b_k$ and
deduce that $p_k/q_k$ has an irrational limit as $k \to \infty$. To prove rigorously
that the limit is $r\tan r$, show by induction that


oo gk (m a k)! gmtk+l 
Tk Sin T — pp COST a | Qm+2k+1)) eH 


for k > 0; the rest of the proof is very similar to that for tan r. 


7.7 Use induction to show that the convergents to the continued fraction 
are given by 


(k+1)!2*+1 


= [(2k + 1)(2k — 3)---]? 
Pk = TE 1)Qk— 5) a 


and gq, = PL 


for k > 0. Some careful algebra gives 


4 A 2k 2k 2k+2 


Pk _ 2 ae 
3 5 2k-—12k+12k4+1’ 


2 
= xox 
dk 1 3 


which in the limit as $k \to \infty$ is Wallis' product for $\pi/2$.
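The limit can be watched numerically. The sketch below uses the standard form of Wallis' product, $\prod_{k \ge 1} \frac{2k}{2k-1}\cdot\frac{2k}{2k+1} = \frac{\pi}{2}$, rather than the convergents $p_k/q_k$ of the exercise, whose exact formulae are not legible in this copy; the function name is illustrative.

```python
from fractions import Fraction
from math import pi


def wallis_partial(n):
    """Exact partial product of Wallis' formula with n factor pairs."""
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= Fraction(2 * k, 2 * k - 1) * Fraction(2 * k, 2 * k + 1)
    return prod


for n in (1, 10, 100, 1000):
    print(n, float(wallis_partial(n)), pi / 2 - float(wallis_partial(n)))
```

The partial products increase towards $\pi/2$, with the gap shrinking roughly in proportion to $1/n$, so convergence is slow.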


This problem appeared in the American Mathematical Monthly in 2004; 
the above hints are adapted from the solution in [53]. 